1.   INTRODUCTION

The  name  "computer"  immediately  associates  the  digital  computer 
with  numerical  calculations.   However,  a  significant  portion  of  digital 
computer  time  is  used  for  non-numerical  tasks  such  as  string  processing. 
Probably  the  most  common  programs  that  use  string  processing  are  pro- 
gramming language  translators  or  compilers.   Since  the  string  processing 
operations  used  in  these  applications  are  well  known,  they  are  useful 
for  comparing  alternative  string  processing  implementations. 

A  compiler  may  conveniently  be  divided  into  two  major  portions, 
the  recognizer  and  the  analyzer.  The  recognizer  scans  the  input  string 
of  source  language  characters  and  recognizes  these  as  a  series  of 
tokens  (identifiers,  numbers,  etc.).   The  analyzer  then  uses  informa- 
tion generated  by  the  recognizer  to  determine  the  structure  of  the 
source  program  and  to  generate  the  required  object  code.   Since  the 
structure  of  the  program  to  be  translated  depends  only  upon  the 
syntactic  units  used  and  their  order  of  appearance,  the  syntax  analyzer 
does  not  need  the  actual  name  or  value  of  each  token.   In  fact,  many 
translators  use  the  recognizer  to  rename  tokens  with  internal  identi- 
fiers or  pointers  before  passing  them  to  the  analyzer.   If  the 
internal  identifiers  or  pointers  are  coded  to  indicate  the  classifi- 
cation of  the  token,  then  the  analyzer  has  all  of  the  information  it 
needs  in  a  compact,  well-defined  form.   In  such  a  system,  the 
recognizer  converts  the  string  of  sparsely  and  irregularly  distributed 
source  language  characters  into  a  new  string  of  compact,  standardized 
token  identifiers  or  pointers.   All  translators  must  maintain  a 
table  of  identifier  tokens  encountered.   Since,  in  many  cases,  the 
internal identifiers are merely pointers to the table location occupied
by  the  original  token,  the  maintenance  of  token  tables  is  a  natural 
task  for  a  recognizer. 
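
The renaming step described above can be illustrated with a minimal serial sketch. It is not the recognizer developed in this report: the function name `recognize`, the tuple representation of internal identifiers, and the simplified token rules (digits form numbers, a letter followed by letters or digits forms an identifier, anything else is a one-character delimiter) are assumptions made only for illustration.

```python
def recognize(source):
    """Serially scan `source` and return (internal_ids, table).

    Identifier and number tokens are filed once in `table` and replaced
    by (class, table position) pairs; delimiters are passed through.
    """
    table = []   # token table: one entry per distinct token
    index = {}   # token text -> position in the table
    out = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
            continue
        if ch.isdigit():
            j = i
            while j < len(source) and source[j].isdigit():
                j += 1
            kind = "number"
        elif ch.isalpha():
            j = i
            while j < len(source) and source[j].isalnum():
                j += 1
            kind = "identifier"
        else:
            out.append(("delimiter", ch))
            i += 1
            continue
        text = source[i:j]
        if text not in index:          # file a token only the first time it is seen
            index[text] = len(table)
            table.append(text)
        out.append((kind, index[text]))
        i = j
    return out, table
```

The analyzer then sees only the class and a table pointer for each token, which is the compact, standardized form the paragraph describes.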

A  conventional  recognizer  usually  works  with  only  one  symbol 
at  a  time.   Since  the  required  operations  are  often  simple  and 
repetitive,  we  might  ask  if  a  parallel  machine  capable  of  operating 
on  several  characters  simultaneously  could  be  efficiently  used  to  speed 
up  the  process.  The  first  large  scale  implementation  of  a  parallel 
computer  is  the  Illiac  IV  computer  which  is  currently  under  construc- 
tion.  In  this  machine,  the  single  accumulator  of  the  Von  Neumann 
computer  is  replaced  by  multiple  accumulators,  each  having  separate 
subsidiary  registers,  arithmetic  hardware,  and  memory  modules.  The 
machine  was  designed  for  arithmetic  problems  that  use  vectors, 
matrices,  or  meshes,  but  it  has  some  features  that  are  useful  for 
string  processing. 

The operations required for the translation of a modern
field-independent programming language such as ALGOL or PL/I are a
challenge  to  implement  on  a  parallel  machine  because  the  basic  source 
program  elements  (tokens)  such  as  numbers  or  identifiers  vary  in 
length  and  separation.   Furthermore,  the  treatment  required  for  a 
given  token  depends  not  only  upon  the  identity  of  the  token,  but  also 
upon  the  context  in  which  it  appears.   Ordinarily,  the  string  of 
input  symbols  is  scanned  from  left  to  right  one  character  at  a  time 
to ensure that all of the information needed is available at each
step. For example, the interpretation of a group of symbols such as A + B
is changed by the insertion of the string delimiter, ", in the input
string ahead of the group. Despite such obstacles, it will be shown
that  a  recognizer  can  be  turned  sufficiently  "inside  out"  to  obtain 
reasonable  efficiency  with  a  parallel  computer  such  as  Illiac  IV. 


2.  THE  ILLIAC  IV  MACHINE 

2.1  The  Quadrant 

The  Illiac  IV  system  is  built  around  four  identical  arrays 
known as quadrants. Each quadrant contains 64 processing units
(P.U.s)  operating  under  the  control  of  a  single  control  unit  (C.U.) 
in  the  arrangement  shown  in  figure  1. 

The construction of an Illiac IV quadrant can be visualized
by considering a conventional computer arrangement consisting of an
arithmetic/logical section for transforming or comparing operands,
a control section with the instruction decoding and sequencing
hardware, and a memory section. A quadrant essentially replicates the
arithmetic/logical section and the memory section of the conventional
computer 64 times, but retains a single control section. In the
single quadrant array mode of operation, in which the quadrants act
independently, the 64 arithmetic/logical units, named processing
elements or P.E.s, are arranged in a string with numbers from 0 to
63. Thus, each element has an adjacent higher neighbor and an
adjacent lower neighbor, with P.E. 0 acting as the higher neighbor
for P.E. 63. Single words may be transferred between neighboring
P.E.s, a process referred to as "routing." Additional routing paths
are provided between each P.E. and the P.E.s located eight positions
higher and eight positions lower in the string, so each P.E. can
communicate directly with four other P.E.s in the quadrant.
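
The routing connections just described can be sketched as a small function. The name `routing_neighbors` is hypothetical, and the sketch assumes that the eight-position paths wrap around the ends of the string in the same way the text states for the nearest-neighbor paths.

```python
N_PE = 64  # processing elements in one quadrant

def routing_neighbors(pe):
    """Return the four P.E. numbers directly reachable from `pe`.

    Distances of +1, -1, +8 and -8 are taken modulo 64; the wraparound
    of the +/-8 paths is an assumption of this sketch.
    """
    return sorted({(pe + d) % N_PE for d in (+1, -1, +8, -8)})
```

For instance, `routing_neighbors(63)` includes P.E. 0, the higher neighbor named in the text.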


Figure 1 - Illiac IV Quadrant
(figure label: Connections to Other Control Units and Input/Output Controller)


2.2  The  Processing  Element 

The  processing  element  is  a  64  bit  unit  with  the  arithmetic 
equipment  of  a  large-scale  floating  point  computer.  As  shown  in 
figure  2,  the  P.E.  contains  an  accumulator  (register  A),  a  register 
to  hold  the  second  operand  for  binary  operations  or  to  act  as  an 
extension  to  register  A  for  double  length  operands  (register  B) , 
a  temporary  storage  register  (register  S),  and  a  register  provided 
with  the  connections  for  routing  data  to  or  from  one  of  four  other 
P.E.s  (register  R).   In  addition  to  the  aforementioned  64  bit  registers, 
the  P.E.  has  a  16-bit  address  register  (register  X)  and  a  separate 
adder mechanism for address arithmetic. Address arithmetic is
performed modulo 2^16, but the memory accessing hardware uses only the 11
least significant bits of the P.E. address field.

The P.E.s can be set to treat the 64 bit register contents
either as 64-bit operands or as pairs of "inner" and "outer" 32-bit
operands. Thus a single instruction may cause 64 simultaneous 64-bit
operations or 128 simultaneous 32-bit operations in a quadrant. All
but  a  few  of  the  P.E.  instructions  are  applicable  to  both  64-bit 
and  32-bit  operands.  The  word  size  is  changed  by  a  control  unit 
instruction  which  sets  all  P.E.s  of  the  quadrant  to  the  selected  word 
size. 

Figure 2 - Processing Element (P.E.)

There is a small set of P.E. instructions that allow simultaneous
addition or subtraction of the eight 8-bit fields or "bytes"
of the 64-bit word. These instructions are unaffected by the word
size state. They are implemented by breaking the carry propagation
in the adder circuitry at eight bit intervals. The same equipment
is used to test 8-bit fields for equality and inequality relations.
These tests operate on the eight fields simultaneously and leave
register A with ones in the least significant bits of byte fields
meeting the test conditions. The remaining bits of register A are
cleared. The character manipulation schemes will rely heavily on these
8-bit operations.
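
The behavior of these byte instructions can be modeled on ordinary integers. The sketch below is an emulation under assumptions, not Illiac IV code: the names `byte_add` and `byte_equal` are hypothetical, and a 64-bit register is represented as a Python integer whose most significant byte is the leftmost field.

```python
def byte_add(a, b):
    """Add the eight 8-bit fields of two 64-bit words independently.

    Carries are discarded at each field boundary, which models the
    broken carry propagation described in the text.
    """
    result = 0
    for k in range(8):
        shift = 8 * k
        field_sum = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)
        result |= (field_sum & 0xFF) << shift
    return result

def byte_equal(a, b):
    """Return a word with a one in the least significant bit of every
    8-bit field in which `a` and `b` agree; all other bits are zero."""
    result = 0
    for k in range(8):
        shift = 8 * k
        if ((a >> shift) & 0xFF) == ((b >> shift) & 0xFF):
            result |= 1 << shift
    return result
```

Comparing a word of eight characters against a word of eight blanks, for example, marks every blank position in a single operation. ASCII codes would serve for such an experiment, although the report itself assumes six-bit Burroughs codes held in eight-bit bytes.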

Each P.E. also has an 8-bit register known as the mode register
(register D), which is not commonly found in conventional computers. The
eight bits are designated E, E1, F, F1, I, G, J, and H respectively.
The I and J bits record the results of various P.E. test operations.
If the P.E. is in the 32-bit mode so that two operands are tested
simultaneously, then the G and H bits hold the additional test results
corresponding to the I and J bits, respectively. The F bit indicates
arithmetic overflow and is supplemented by the F1 bit in the 32-bit
mode. The E and E1 bits, or enable bits, perform a function unique
to the array computer. Setting these two bits of a particular P.E.
to one causes that P.E. to operate in the "enabled" state by fully
executing any P.E. instructions encountered by its controlling C.U.
However, one of these instructions may cause the E or E1 bit to be set
to zero, either directly or by using the value of one of the other
mode register bits. When this occurs, the P.E. enters a "disabled"
condition in which registers A, S, and X cannot be changed and memory
references are ignored, effectively stopping activity in the P.E.
The E bit controls the action of register sections corresponding to
the outer operand of the 32-bit word format, while the E1 bit controls
the inner operand sections. Thus, when E and E1 differ it is
possible to modify 32 bits of a register while leaving the remaining
portion of the register unchanged.

Some  test  instructions  use  the  address  adder  and  will  cause 
the  B  register  of  a  disabled  P.E.  to  be  changed.   Register  R  of  a 
disabled  P.E.  is  also  liable  to  change,  since  it  must  be  available 
for routing between other enabled P.E.s. Since P.E.s may be enabled
according  to  predetermined  patterns  or  by  the  results  of  test  operations, 
one  can  obtain  the  effect  of  program  branching  by  executing  several 
sets  of  instructions  with  a  different  group  of  P.E.s  enabled  for  each 
set.  This  will  be  used  in  many  places  in  the  character  processing 
routines  to  follow.   Obviously,  such  a  technique  must  be  used  with 
caution  since  disabled  P.E.s  represent  unused  processing  power. 
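
The branching effect obtained from the enable bits can be sketched with lists standing in for the P.E. accumulators. The names `masked_apply` and `branch` are hypothetical, and one list element per P.E. is an assumption of this illustration.

```python
def masked_apply(values, enabled, op):
    """Apply `op` only in the positions whose enable flag is set;
    disabled positions keep their old contents."""
    return [op(v) if e else v for v, e in zip(values, enabled)]

def branch(values, test, then_op, else_op):
    """Emulate an if/else over all elements with two masked passes."""
    mask = [test(v) for v in values]        # test results, one per P.E.
    values = masked_apply(values, mask, then_op)
    inverse = [not m for m in mask]         # enable the complementary group
    return masked_apply(values, inverse, else_op)
```

Both instruction sets are executed in sequence, so the cost is the sum of the two paths, and the elements disabled during each pass are the unused processing power the text warns about.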

Processes  such  as  normalization  require  that  a  single  shift 
command  provide  for  differing  shift  counts  in  the  various  P.E.s.  The 
common  technique  of  repeatedly  shifting  a  short  distance  until 
reaching  the  desired  shift  count  was  deemed  unacceptably  slow,  so  a 
cascade  of  shift  gates  that  can  produce  any  shift  or  rotation  from 
0  to  64  bits  in  two  clock  times  is  provided.  This  shifter  is  referred 
to  as  the  "barrel  switch"  in  most  Illiac  IV  descriptive  literature. 
The  shift  count  for  the  barrel  switch  can  be  obtained  from  a  unit 
that  detects  the  position  of  the  first  one  in  the  mantissa  field  of 
the  P.E.  accumulator  (the  leading  one  detector)  or  it  may  be  obtained 
from  a  P.E.  or  C.U.  register.  The  shift  count  may  be  indexed  as  if 
it  were  a  memory  address.  This  feature  is  extremely  useful  for 
masking  and  alignment  operations. 


2.3  The  Control  Unit

Figure 3 shows the major sections of the control unit. The
advanced station (ADVAST) portion decodes, interprets, and controls
the flow of all instructions executed in the quadrant. The ADVAST has
four 64-bit accumulators designated ACAR 0 through ACAR 3, and a local
storage area consisting of 64 integrated circuit registers known as
the ADVAST data buffer (ADB). ADVAST also has a 24-bit address arithmetic
unit which can use any of the four accumulators as an index
register divided as shown below:


    Field:    Not used    Increment field    Limit field    Index field
    Width:    1 bit       15 bits            24 bits        24 bits

The first bit of the increment field (bit 1 of the ACAR) acts as a
sign for the increment. Indexing instructions that use the increment
field to modify the index field will subtract the increment from the
index if bit 1 is a one, and will add the increment and index otherwise.
Although the arithmetic section is limited to 24 bits, ADVAST can apply
any of the standard Boolean operations to 64-bit operands, and can
set, clear, complement, or test any of the individual bits in an
accumulator. The C.U. can examine the mode bit settings of the P.E.s
in its quadrant with an instruction that sets each of the 64 bits in
a designated ACAR to agree with the value of the selected mode bit
in a corresponding P.E. For example, if the E bit were selected, a
one in the last position of the accumulator would indicate that the
E bit of P.E. 63 was set to one (and that P.E. 63 was enabled).
An E bit pattern with only zeros indicates that all of the P.E.s are
disabled, while the complementary pattern of ones indicates that all
P.E.s are enabled. Such patterns may be detected by C.U. test
instructions that check for all zeros or all ones in the entire 64-bit
accumulator or in the rightmost 24 bits of it. Unlike P.E. test
operations, which merely cause status bits to be set, C.U. tests
control the instruction flow by skipping forward or backward in the
instruction stream. A pattern in an ACAR may be transferred to the
P.E. mode registers so that each of the 64 bits in the pattern sets
the selected mode register bit in a different P.E. A novel feature of
the C.U. is the leading one detection hardware which can replace a
64-bit pattern in one of the accumulators with a binary integer that
represents the position of the leftmost one (or zero) in the pattern.
This facility is a boon for finding special characters or subfields
in a long chain of characters.

Figure 3 - Illiac IV Control Unit
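
The ACAR index fields and the leading one detector can be modeled on 64-bit integers as follows. This is a sketch under stated assumptions rather than a description of the hardware: bit 0 is taken as the leftmost bit, the increment is treated as a sign bit followed by a 14-bit magnitude, index arithmetic is taken modulo 2^24, and returning `None` for an all-zero pattern is a convention of the sketch. All function names are hypothetical.

```python
def acar_fields(acar):
    """Split a 64-bit ACAR into (increment, limit, index) fields."""
    index = acar & 0xFFFFFF              # bits 40-63
    limit = (acar >> 24) & 0xFFFFFF      # bits 16-39
    increment = (acar >> 48) & 0x7FFF    # bits 1-15 (bit 0 is not used)
    return increment, limit, index

def step_index(acar):
    """Modify the index field by the increment field.

    The increment is subtracted when its first bit (bit 1 of the ACAR)
    is a one and added otherwise.
    """
    increment, _, index = acar_fields(acar)
    magnitude = increment & 0x3FFF       # assumed 14-bit magnitude
    if increment & 0x4000:               # sign bit of the increment
        index = (index - magnitude) % (1 << 24)
    else:
        index = (index + magnitude) % (1 << 24)
    return (acar & ~0xFFFFFF) | index

def mode_pattern(bits):
    """Pack one mode bit per P.E. into a word, with P.E. 0 leftmost."""
    return sum(bit << (63 - pe) for pe, bit in enumerate(bits))

def leading_one(pattern):
    """Position of the leftmost one, counting the leftmost bit as 0."""
    if pattern == 0:
        return None
    return 64 - pattern.bit_length()
```

Applied to a pattern gathered from a P.E. test bit, `leading_one` yields the number of the lowest-numbered P.E. that met the test, which is how a special character can be located in a long chain.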

The  second  major  section  of  the  control  unit,  the  final  station 
or  FINST,  receives  only  instructions  that  require  P.E.  action.   FINST 
converts  the  P.E.  instructions  into  sequences  of  appropriate  enable 
and  gate  signals  and  transmits  these  signals  to  all  P.E.s  in  the 
quadrant  over  a  system  of  about  260  control  lines.   FINST  also  sends 
any  operands  or  addresses  needed  from  the  C.U.  by  "broadcasting"  the 
required  data  over  a  64-bit  common  data  bus  (CDB). 

Since  ADVAST  controls  the  instruction  stream,  all  instructions 
pass  through  it  for  decoding  first.   If  the  operation  involves  only 
control  unit  hardware  (a  C.U.  instruction),  the  ADVAST  completes  the 
operation  so  that  the  instruction  never  reaches  FINST.   If  the 
instruction  is  a  P.E.  instruction,  ADVAST  decodes  it,  provides  any 
indexing operations necessary at the control unit level, and passes
the recoded instruction and an operand, if required, to FINST for
disposal. Thus some instructions may be entirely processed by ADVAST
while others may pause in ADVAST only long enough for decoding before
being  sent  to  FINST  for  execution.  To  avoid  situations  where  either 
ADVAST  or  FINST  is  idle  waiting  for  the  other  section,  the  instruction- 
operand  pairs  are  passed  from  ADVAST  to  FINST  through  an  eight  word 
first-in,  first-out  final  queue  named  FINQ.   Occasionally,  ADVAST  will 
require  results  from  the  previous  FINST  operation,  usually  when  reading 
test  results  from  the  P.E.s.  In  such  a  case  ADVAST  is  halted  until 
the  FINQ  is  empty.   Otherwise,  FINQ  allows  for  a  considerable  amount 
of  overlap  between  C.U.  and  P.E.  instructions.   This  FINQ  overlap 
ability  makes  program  timing  estimation  difficult,  since  the  total 
execution  time  is  rarely  the  sum  of  ADVAST  (C.U.)  and  FINST  (P.E.) 
time,  although  it  cannot  exceed  this  sum. 

The  three  other  control  unit  sections  are  of  little  interest 
here.   The  instruction  look-ahead  (ILA)  tries  to  maintain  a  supply 
of  new  instructions  in  a  set  of  64  integrated  circuit  instruction 
storage  registers  known  as  the  instruction  word  store.   This  is  done 
by  fetching  blocks  of  eight  instruction  words  (there  are  two  32-bit 
instructions  per  word)  from  quadrant  memory.   Access  to  quadrant 
memory is controlled by the C.U.'s memory service unit (MSU). The
remaining control unit section is a test-maintenance unit (TMU).


2.4  Memory  Addressing

Each  processing  element  is  provided  with  a  P.E.  memory  unit 
(P.E.M.) consisting of 2048 words of 64-bit thin film memory. Thus
each  quadrant  contains  an  aggregate  of  131,072  words  of  memory  which 
will  be  referred  to  as  quadrant  memory.   P.E.  memory  addresses  usually 
originate  in  the  address  field  of  a  P.E.  instruction.  The  control 
unit  extracts  this  field  when  the  instruction  is  decoded  and  may 
increment  or  decrement  this  address  by  the  contents  of  one  of  four  C.U. 
accumulator  registers  for  a  first  level  of  indexing.  The  address  is 
then  distributed  to  all  P.E.s  in  the  quadrant  where  it  may  be  added 
to  or  subtracted  from  the  contents  of  each  P.E.'s  index  register 
(register  X)  to  provide  a  second  level  of  indexing.   (Note  that 
hardware  design  considerations  have  dictated  that  the  address  be 
subtracted  from  the  index  register  contents  instead  of  the  more  usual 
subtraction  of  the  index  register  contents  from  the  incoming  address. 
Some  index  subtraction  operations  in  the  sequel  may  be  confusing  if 
this  requirement  is  not  kept  in  mind.)  The  16  bit  address  produced 
by  the  P.E.  hardware  to  access  one  of  the  2048  P.E.M.  locations  will 
be  known  as  a  P.E.M.  address.   (The  memory  hardware,  however,  uses 
only  the  least  significant  11  bits  of  this  16  bit  field.)  Normally, 
an  instruction  that  references  a  P.E.M.  address  will  cause  64  memory 
locations  to  be  simultaneously  accessed,  one  in  each  P.E.M.  of  the 
quadrant.   If  no  indexing  is  performed  in  the  P.E.s,  each  location  in 
this  block  of  64  quadrant  memory  locations  will  have  the  same  P.E.M. 
address.   Such  a  block  will  be  called  a  slice.   One  of  the  64  words 
or  slice  elements  in  the  slice  will  be  located  in  P.E.M.  0.  This 
word will be defined as the slice leader of the slice. Since indexing
in the P.E.s is possible, however, the locations actually accessed
may not all be in the same slice.

The  control  units  have  no  memory  except  for  a  small  buffer, 
and  obtain  instructions  and  operands  from  the  P.E.M.s.  To  permit  the 
referencing  of  any  location  in  the  four  quadrant  memories,  a  quadrant 
address  of  24  bits  is  created  by  appending  eight  bits  to  the  least 
significant  end  of  the  16-bit  P.E.  address.   The  most  significant  two 
bits of the 8-bit addition indicate the desired quadrant, while the
remaining six bits locate a specific P.E.M. in that quadrant. In the
single quadrant mode, the quadrant specification bits become irrelevant,
and the address arithmetic is arranged so that, as the addresses are
incremented, the successor to a given address occurs at the same
location in the next higher numbered P.E., except that successors to
addresses in the last P.E. (P.E. 63) are located in the next higher
numbered location in the first P.E. (P.E. 0). The 24-bit quadrant
address  has  the  following  format: 


    Field:    P.E. memory location    Quad number    P.E. number
    Width:    16 bits                 2 bits         6 bits

    (P.E. memory location: the most significant five bits are not
    currently meaningful. Quad number: this field should not change
    if a single quadrant is being used.)

Note that 11 of the first 16 bits of a quadrant address act as a
slice address, specifying which of the 2048 slices contains the desired
address, while the last six bits locate the specific slice element
desired.
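
The quadrant address format can be unpacked with a few shifts and masks. The sketch below follows the field widths given above; the function names are hypothetical, and the successor rule is written for the single quadrant mode only, leaving the quadrant bits untouched.

```python
def split_quadrant_address(addr):
    """Split a 24-bit quadrant address into (slice, quad, pe).

    Only the 11 least significant bits of the 16-bit P.E. memory
    location are kept, since the memory hardware ignores the rest.
    """
    pe = addr & 0x3F                   # last six bits: slice element
    quad = (addr >> 6) & 0x3           # two quadrant bits
    location = (addr >> 8) & 0xFFFF    # 16-bit P.E. memory location
    return location & 0x7FF, quad, pe

def successor_single_quadrant(addr):
    """Next address in single quadrant mode: same location in the next
    P.E., or the next location in P.E. 0 after P.E. 63."""
    location = (addr >> 8) & 0xFFFF
    quad = (addr >> 6) & 0x3
    pe = (addr & 0x3F) + 1
    if pe == 64:
        pe = 0
        location = (location + 1) & 0xFFFF
    return (location << 8) | (quad << 6) | pe
```

Stepping through consecutive addresses in this way walks across one slice, element by element, before moving to the slice leader of the next slice.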

The  array  may  operate  with  the  four  quadrants  completely 
separated  or  else  two  or  more  of  the  quadrants  may  execute  the  same 
instruction  stream  and  coordinate  their  routing  operations  to  form  a 
larger array. Only a single quadrant will be used for the problem
in this study. If the full array of 256 P.E.s were available for
compilation, each quadrant would probably be assigned to a separate
compilation task. The majority of the Illiac IV supervisory programs
reside  in  a  B6500  commercial  computer  which  acts  as  the  controller 
for  the  array.   The  B6500  controls  an  extremely  large  disk  file 
system  with  transfer  rates  up  to  5  x  10°  bits/second.  Since 
input/output operations are not directly controlled by the array,
this  study  will  assume  that  all  of  the  necessary  information  is 
available  in  quadrant  memory. 


3.   RECOGNIZER  SPECIFICATIONS

3.1  The  Source  Language 

To see how a parallel processor may be used for program
translation, we will study an Illiac IV assembly language coding of
a recognizer for a dialect of the ALGOL programming language known
as Burroughs Extended ALGOL. Thus the source material should conform
to the rules specified in the 1966 edition of the Burroughs Extended
ALGOL Language Manual, with a few exceptions. The Burroughs Extended
ALGOL is defined for a set of 63 characters, each composed of six bits.
Since the smallest word subdivision provided in the Illiac IV
instruction set is an eight bit "byte," and since newer communication
standards such as EBCDIC (Extended BCD) and ASCII (American Standard
Code for Information Interchange) provide for an eight bit character
size, a source string of eight bit characters will be assumed. The
first two bits of each character will be assumed to be zeros, and
the remaining six bits will be used to define input symbols according
to the coding of appendix B-1 of the Burroughs Extended ALGOL Language
Manual. All constructs allowed by the manual should be properly
processed by the recognizer with the exception of the COMMENT construct
and a more stringent limitation on the use of reserved words.

A  conventional  compiler  usually  calls  upon  the  recognizer  to 
process  only  one  token  before  returning  control  back  to  the  analyzer. 
However,  if  the  full  capabilities  of  the  Illiac  IV  array  are  to  be 
realized,  it  is  apparent  that  more  than  one  token  must  be  processed 
during each recognizer cycle. With the eight-bit instructions, each
Illiac IV quadrant can simultaneously manipulate 512 characters
(eight characters in each of the 64 P.E.s).
Since the specifications for Burroughs Extended ALGOL provide for
tokens  from  1  to  63  characters  long,  the  recognizer  is  able  to  examine 
several  tokens  simultaneously.  Thus  the  parallel  recognizer  should 
concurrently  build  and  classify  several  tokens  which  may  not  be  in 
the  same  class. 

ALGOL  is  built  upon  four  classes  of  tokens:  delimiters, 
numbers,  identifiers,  and  strings.  The  Burroughs  specifications 
further  require  that  identifier  and  string  tokens  consist  of  1  to  63 
contiguous  characters.  Two  types  of  delimiter  tokens  appear,  reserved 
words  such  as  BEGIN,  FOR,  PROCEDURE,  and  the  set  of  single  character 
delimiters  or  "special"  characters  consisting  of  character  symbols 
that are neither numeric nor alphabetic symbols. Care must be taken
to ensure that delimiter symbols which are components of string tokens
are not confused with delimiter symbols occurring in the normal
context. This confusion arising from the string construct is resolved
quite  naturally  by  recognizers  that  scan  serially,  but  it  presents  a 
problem  for  parallel  recognizers. 

3.2   Internal  Identifiers 

The  final  goal  of  the  recognizer  is  the  production  of 
internal  identifiers  in  the  following  format: 


    Bits:     0-7       8-39      40-63
    Field:    HEADER    zeros     POINTER

All internal identifiers begin with an eight-bit header. If the
internal identifier represents a delimiter token, these eight bits
are all zero. Otherwise, the eight bits of the header may be
represented as follows:


ABNNNNNN 


The  two  bits  labelled  A  and  B  identify  the  classification  of  the  token 
according  to  the  following  convention: 


    A         B    Classification
    0         0    Number token
    1         0    Identifier token
    0 or 1    1    String token

The six bits designated N form a binary integer that gives the
number of symbols included in the token. Delimiter tokens belong to
a pre-defined set and are not filed in a token table, so their
internal identifiers can be simplified. Delimiters that appear in
the input block as a single non-alphanumeric symbol can be represented
by identifier words having the original symbol in the rightmost eight
bits and zeros elsewhere. Reserved word delimiters such as BEGIN,
PROCEDURE, FOR, etc. can be replaced by eight bit quantifiers by
assigning each word a unique integer between 63 and 256. These
quantifiers can then be used as the last eight bits of an identifier
word that has zeros elsewhere. Since input symbols are limited to
six bits, the eight-bit quantifiers can represent all possible
non-alphanumeric symbols as well as over 190 reserved word delimiter
tokens without conflict. Only the last eight bits of the delimiter
token pointer field are used to hold the actual delimiter symbol or
a pseudo-symbol for reserved word delimiters. If the internal
identifier replaces a number, string, or identifier token, the pointer
field will contain the memory location in a table of the beginning of
the stored token.
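
The internal identifier layout can be packed and examined as in the following sketch. The function names are hypothetical, bit 0 is taken as the most significant bit of a 64-bit integer, and string headers are written with A = 0 here although the format allows either value.

```python
CLASS_BITS = {"number": 0b00, "identifier": 0b10, "string": 0b01}  # A, B bits

def make_header(kind, length):
    """Build the eight-bit header ABNNNNNN for a 1 to 63 symbol token."""
    if not 1 <= length <= 63:
        raise ValueError("token length must be between 1 and 63")
    return (CLASS_BITS[kind] << 6) | length

def internal_identifier(kind, length, pointer):
    """Header in bits 0-7, zeros in bits 8-39, pointer in bits 40-63."""
    return (make_header(kind, length) << 56) | (pointer & 0xFFFFFF)

def delimiter_identifier(code):
    """Delimiter symbol or reserved word code in the rightmost eight bits."""
    return code & 0xFF

def classify(word):
    """Recover the token class from the header of an internal identifier."""
    header = word >> 56
    if header == 0:
        return "delimiter"
    if header & 0x40:            # B bit set: string, whatever A is
        return "string"
    return "identifier" if header & 0x80 else "number"
```

Because every non-delimiter header carries a character count of at least one, an all-zero header identifies a delimiter without ambiguity.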

3.3  Tokens 

Tokens that are not delimiters are arranged to facilitate
filing them in a table. Each of these tokens begins with the same
eight-bit header that is used in the corresponding internal
identifier. The two classification bits of the header allow number, string,
and identifier tokens to be held in the same table. The character
count indicates the number of bytes following the header that contain
meaningful characters, and it can be used to determine the number of
words occupied by the stored token. Since a token holds at most 63
characters in addition to its one-byte header, no token may occupy
more than eight words. Delimiter tokens contain exactly one input
symbol which is placed in the rightmost eight bits of the one word
token. The rest of the word, including the header field, consists of
zeros so that a delimiter token and its internal identifier are
identical. The zeros in the header field distinguish a delimiter token from
the other token types. Since the header is an essential part of
every token, any subsequent use of the term token will refer to
a construct having an eight bit token header followed by a symbol
tail. The symbol tail is made up of the input symbols of the
identifier, number, or string followed by enough zeros to reach a word
boundary. The symbol tail for a delimiter consists of fifty-six
zeros followed by the original delimiter character. Thus, tokens will
appear in one of the following formats:

DELIMITER TOKEN:

    HEADER                 DELIMITER
    (All zeros)    ...     SYMBOL

NUMBER, IDENTIFIER, OR STRING TOKEN:

    HEADER             FIRST SYMBOL    SECOND SYMBOL           ZEROS TO END OF WORD
                                                               (If necessary)
    A B N N N N N N    XXXXXXXX        YYYYYYYY         ...    00000000

(A and B represent the two bit classification code. The N's
form a six bit character count.)


Although  the  recognizer  could  be  required  to  convert  number  tokens  to 
a  machine  format,  this  step  will  be  left  for  the  analyzer.  Number 
tokens  will  be  formed  exclusively  from  digit  symbols  and  will  be  added 
to  the  table  as  if  they  were  identifier  or  string  tokens.  Thus, 
floating  point  quantities  may  appear  as  several  number  tokens  separated 
by  appropriate  delimiters. 
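
The stored token format can be sketched as a routine that packs a header and a symbol tail into 64-bit words. The name `build_token` is hypothetical, symbols are given as small integers standing for six-bit input codes, and string headers are written with A = 0 as an assumption.

```python
CLASS_BITS = {"number": 0b00, "identifier": 0b10, "string": 0b01}  # A, B bits

def build_token(kind, symbols):
    """Return the list of 64-bit words for a number, identifier, or
    string token: one header byte, the symbols, then zero fill to a
    word boundary."""
    if not 1 <= len(symbols) <= 63:
        raise ValueError("a token holds 1 to 63 symbols")
    header = (CLASS_BITS[kind] << 6) | len(symbols)
    data = bytes([header]) + bytes(symbols)
    data += bytes(-len(data) % 8)        # zeros to the end of the last word
    return [int.from_bytes(data[i:i + 8], "big") for i in range(0, len(data), 8)]
```

A seven-symbol token fills exactly one word, an eight-symbol token spills into a second word, and the longest token of 63 symbols occupies eight words.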


3.4  Sections 


The  recognizer  is  confronted  with  a  block  of  512  characters, 
many  of  them  blank,  that  form  tokens  containing  from  1  to  63  symbols 
each.  Each  of  the  64  P.E.s  in  a  quadrant  holds  eight  of  the  input 
characters, thus partitioning the 512 input symbols into
eight-character sub-blocks. The P.E. boundaries also divide the token
symbols into sections of eight symbols or less. Consider, for example,
the  following  input  statement: 

ANS ← IDENTIFIER + 24 TIMES SUM;

This  statement  could  appear  in  P.E.  memory  with  the  following  grouping: 


    P.E. 0:    A  N  S  ←  I
    P.E. 1:    D  E  N  T  I  F  I  E
    P.E. 2:    R  +  2  4  T  I  M
    P.E. 3:    E  S  S  U  M  ;

    (Blank characters occupy the remaining positions of each
    eight-character sub-block.)

Note that TIMES is not an identifier token, but a reserved word which
replaces the delimiter symbol ×. Since reserved words have the same
structure  as  identifier  tokens,  the  token  building  routine  treats  them 
as  identifier  tokens  which  will  be  recognized  as  delimiters  later  in 
the  table  maintenance  process.   To  minimize  interference  from  the  P.E. 
boundaries,  the  scanner  first  gathers  the  characters  into  sections 
using  the  section  builder,  and  then  assembles  the  sections  into  complete 
tokens with the section joiner. Sections created by section builder
will ultimately be combined to become tokens, so they are constructed
with the same format as tokens: number, identifier, or string sections
have an eight-bit header followed by a symbol tail, while delimiter
sections consist of a single word containing a delimiter symbol
preceded by fifty-six zeros. Obviously, the symbol tail of a section
contains at most eight symbols. If all of the symbols of a token
occur within the same P.E., the section formed from these symbols will
be a complete token. This is always the case for delimiter tokens
that are not reserved words since, as noted before, section builder
treats reserved words as identifiers and can thus assume that all
delimiter tokens contain exactly one character. All of the tokens
in  the  example  except  the  tokens  for  the  identifier  named  IDENTIFIER 
and  the  reserved  word  TIMES  generate  sections  that  are  complete  tokens. 
The  token  for  TIMES  is  built  from  two  sections;  the  token  for 
IDENTIFIER  requires  three  sections. 
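
The partitioning can be sketched in a few lines. This is only an
illustration, not the recognizer's method: the function name is
hypothetical, the placement of blanks in the example string is an
assumption chosen to reproduce the grouping shown above, and string
tokens are ignored.

```python
import re

PE_WIDTH = 8  # characters held by one P.E.


def sections_per_pe(text: str) -> list[list[str]]:
    """Split text into eight-character sub-blocks and list each block's sections."""
    padded_len = -(-len(text) // PE_WIDTH) * PE_WIDTH
    text = text.ljust(padded_len)  # pad the last sub-block with blanks
    blocks = [text[i:i + PE_WIDTH] for i in range(0, len(text), PE_WIDTH)]
    # A section is a run of letters and digits, or one delimiter character.
    return [re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9 ]", b) for b in blocks]


example = "ANS  ← IDENTIFIER+24 TIMES SUM; "
for pe, sections in enumerate(sections_per_pe(example)):
    print("P.E.", pe, sections)
```

With this input, IDENTIFIER falls into three sections (I, DENTIFIE, R)
and TIMES into two (TIM, ES), as described above.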

4.   THE CHARACTER CLASSIFIER

4.1  The  Classification  Process 

Section  builder  does  not  directly  examine  the  input  symbols 
to  determine  the  action  to  be  taken,  but  is  guided  by  a  control  string 
in  which  there  is  an  eight-bit  control  byte  corresponding  to  each  input 
character.   This  control  string  is  generated  by  a  character  classifier 
that  begins  by  placing  each  of  the  512  input  symbols  into  one  of  the 
following  classes: 

α-symbol:  An alphabetic letter.

β-symbol:  A blank symbol that does not belong in a string token.

γ-symbol:  A symbol that is part of a string token. (A symbol
           properly enclosed by quote symbols.)

δ-symbol:  A delimiter symbol, that is, a symbol that does not fit
           into any of the other classifications.

1-   symbol:   A  numeric  symbol  (Digit). 

φ-symbol:  A null character. (Should not appear in any output
           token.)

Conventional  recognizers  apply  a  series  of  tests  to  each  individual 
character,  using  the  results  of  one  test  to  determine  the  next  test 
until  the  symbol  is  identified  with  a  specific  symbol  class.  The 
parallel recognizer uses the eight-bit relational test instructions to
test  all  of  the  512  input  characters  simultaneously.  The  only 
characters  that  change  the  classification  of  other  characters  are  the 
quote  characters  which  always  signal  the  presence  of  string  tokens, 
so the classifier checks for these symbols first. If quote symbols
are present, every odd-numbered quote symbol should be the beginning
of a string token that will be terminated by the next even-numbered
quote character. If the previous block of source symbols ended
with an uncompleted string token, then the roles of the odd- and
even-numbered quote symbols are interchanged. If any quote symbols are
detected, a marker bit is generated for each symbol in the string
tokens. To do this, the bits that mark quote symbols are shifted
right one character position in each P.E. and added modulo two. The
process is repeated eight times. The bit corresponding to the
rightmost character in each P.E. now indicates whether an odd number of
quote signs occurred in that P.E. These bits are accumulated in an
ACAR and the process is repeated for 63 shifts. Each one in the ACAR
now indicates that the markers in the P.E. corresponding to that ACAR
bit position should be complemented. When this has been done, the
newly generated bits mark all of the symbols that belong to a string
token.
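
The modulo two scheme amounts to a running parity of the quote markers,
first inside each P.E. and then across P.E.s. The sketch below is an
illustration under assumptions: the function name is hypothetical, the
inner loop stands in for the repeated shift and add step, a plain carry
stands in for the ACAR, and a marker of one on the opening quote itself
is a choice of this sketch, not something the text specifies.

```python
BLOCK = 8  # characters held by one P.E.


def string_markers(text, quote='"', open_at_start=False):
    """Return one marker bit per character; 1 means inside a string token."""
    bits = [1 if c == quote else 0 for c in text]
    blocks = [bits[i:i + BLOCK] for i in range(0, len(bits), BLOCK)]

    # Step 1: inside each P.E., shift right by one character and add
    # modulo two, repeatedly, giving the running parity of the block.
    local = []
    for blk in blocks:
        acc, shifted = list(blk), list(blk)
        for _ in range(BLOCK):
            shifted = [0] + shifted[:-1]
            acc = [a ^ s for a, s in zip(acc, shifted)]
        local.append(acc)

    # Step 2: carry the parity from P.E. to P.E. and complement the
    # markers of every P.E. that starts inside a string.
    carry = 1 if open_at_start else 0
    out = []
    for acc in local:
        out.extend(b ^ carry for b in acc)
        carry ^= acc[-1]  # rightmost bit = odd number of quotes in this P.E.
    return out
```

Setting open_at_start models a previous block that ended with an
uncompleted string token.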

Unfortunately, the routine just described breaks down if a
"three-quote" construct is encountered. This construct arises
because the Burroughs character set does not include the right and
left  quote  symbols  used  in  the  ALGOL  report.   Thus,  while  quote 
characters  may  be  readily  inserted  into  the  string  tokens  described 
in  the  ALGOL  report,  a  group  of  three  successive  quote  symbols  is 
required  to  represent  a  single  quote  character  inside  of  a  string 
token  in  Burroughs  Extended  ALGOL.   Since  three  consecutive  quotes 
upset  the  modulo  two  scheme  used  in  generating  the  string  symbol 
markers, such groups must be detected and remedial action must be
taken first. This is done by detecting pairs of adjacent quote symbols,
deleting their markers from the set of quote markers, and then marking
them as null characters to prevent their appearance in the output
tokens. If a single quote symbol follows such a pair, the quote marker
corresponding to this symbol is deleted so that the now unmarked quote
sign will be ignored by the modulo two routine. Sequences of several
contiguous quote symbols must be reduced by repeated use of the quote
pair  remover.   An  unfortunate  by-product  of  the  removal  of  quote  symbol 
pairs  is  a  shrinkage  in  the  length  of  the  token.  As  we  will  see,  the 
possibility  that  there  may  be  gaps  in  the  symbol  tails  necessitates 
the  use  of  more  general  (and  thus  more  complicated)  section  builder 
and  section  joiner  schemes. 

After  string  components  have  been  properly  identified,  the 
character classifier uses a series of eight-bit inequality tests to
separate numeric and alphabetic characters from the delimiter characters.
Digits  may  be  easily  isolated,  but  the  Burroughs  coding  requires 
several  inequality  tests  to  identify  the  alphabetic  characters,  as 
they  occur  in  three  groups  separated  by  delimiter  symbols.  This  is 
of  little  concern  to  a  parallel  recognizer  since  each  step  is  being 
applied  to  512  characters  instead  of  just  one  symbol.  The  results 
of these tests are recorded in a marker string. Each eight-bit byte
of  the  marker  string  corresponds  to  one  of  the  input  symbols  and 
indicates  the  classification  of  that  symbol  using  the  following  coding: 


[Figure: marker byte layout, bit positions 0 through 7. The class
indicators appear in the order φ, β, γ, numeric, δ, α.]

Bits 0 and 3 are not used. The example input statement and its
corresponding marker string are given below:

[Figure: the example input statement as held in P.E. 0 through P.E. 3,
with the class symbol of each character written beneath it.]

4.2     The  Control  String 

The marker string is easily converted into a control string in
which each eight-bit byte is coded in the following manner:

[Figure: control byte layout, bit positions 0 through 7.]

The α, γ, and δ indicators have the same meaning as they did in
the marker string. The two new bits of control information have the
following significance:

X  :   Indicates  a  null  character  or  a  blank  character  that  is 

not  contained  in  a  string  token.   (Should  not  appear  in 

any  output  token.) 
£    Section  store  indicator.  Marks  either  a  delimiter  symbol 

or  a  blank  symbol  that  is  not  contained  in  a  string  if 

the  preceding  symbol  is  not  a  delimiter  symbol  or  a 

non-string  blank  symbol. 
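
The rule for the section store indicator can be stated compactly in
code. This is a sketch only: the class names and the function name are
hypothetical, and it assumes that nothing is pending before the first
character of the block.

```python
def store_indicators(classes):
    """Return True where a pending partial section must be stored.

    classes holds one name per input character, drawn from "alpha",
    "numeric", "string", "delimiter", "blank" (non-string blank), "null".
    """
    breaks = {"delimiter", "blank"}
    out = []
    prev = "blank"  # assumption: no partial section precedes the block
    for c in classes:
        # Set only on a delimiter or non-string blank whose predecessor
        # is neither a delimiter nor a non-string blank.
        out.append(c in breaks and prev not in breaks)
        prev = c
    return out


# "AB +C": the blank after B triggers a store; the "+" that follows does not.
print(store_indicators(["alpha", "alpha", "blank", "delimiter", "alpha"]))
```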
The control string format is designed so that the control bytes may
be loaded directly into the eight-bit P.E. mode registers to control
the section assembly process. The section store indicators appear in
bit positions 0 and 1 since both the E and E1 enable bits must be set
to one to completely enable a P.E. Bit positions 2 and 3 are the fault
bits in the mode register. These are always set to zero to avoid
complications with the interrupt hardware. The example input statement
would give rise to the following control string:

[Figure: control string for the example input statement.]

5.   THE SECTION BUILDER

5.1  The  Section  Builder  Process 

Section  builder  processes  the  eight  characters  held  in  each 
of  the  64  P.E.s,  beginning  with  the  leftmost  character  and  moving 
right  under  the  direction  of  the  control  string.  The  input  characters 
are  held  in  the  R  registers  of  the  P.E.s.  The  section  builder  begins 
constructing  a  partial  section  in  each  P.E.  This  partial  section 
consists  of  a  symbol  tail  in  register  A  and  a  header  in  register  S. 
The eight-bit header is kept in bit positions 53 through 60 of the
S  register,  an  offset  of  three  bit  positions  from  the  right  end  of 
the  register.  Thus,  the  character  count  of  the  header  is  incremented 
by  adding  eight  instead  of  one.  The  reason  for  this  will  become  clear 
later.   Initially,  the  partial  section  symbol  tail  and  header  are  both 
all  zeros.   If  an  alphabetic,  numeric,  or  string  character  is  encountered, 
this  character  is  extracted  from  the  set  of  input  characters  and  added 
to  the  end  of  the  symbol  tail  in  register  A.  The  character  count  in 
register  S  is  also  incremented.   If  the  new  symbol  is  an  alphabetic 
character,  the  first  bit  of  the  partial  section  header,  located  in 
bit  53  of  the  S  register,  is  set  to  one.   Bit  54  is  set  to  one  when  a 
string  character  is  added.   If  a  delimiter  symbol  is  encountered  in 
the  input,  the  partial  section  previously  being  accumulated  (if  any) 
is  assembled  into  a  complete  section  and  stored  in  P.E.  memory.   The 
delimiter  symbol  is  then  extracted  from  the  set  of  input  symbols,  and 
placed  in  the  rightmost  eight  bits  of  register  A  to  form  a  delimiter 
section, which is also stored in P.E. memory. Register A and register
S are then both cleared to begin a new partial section. The
occurrence of a blank symbol that is not a part of a string token will
also cause the section builder to assemble and store any unfinished
partial sections, but the blank symbol itself will be discarded.
The  section  builder  does  not  need  to  directly  examine  the  input  symbols 
to  determine  the  action  to  be  taken  since  its  behavior  is  completely 
determined  by  the  control  string.  The  first  step  in  processing  a  new 
input  character  is  to  load  the  control  byte  corresponding  to  that 
symbol  into  the  P.E.'s  mode  register.   This  immediately  disables  the 
P.E.s  that  do  not  have  partial  sections  to  be  stored.  When  the  partial 
section  storage  operation  is  complete,  the  mode  register  bits  are 
rearranged  to  enable  only  P.E.s  that  are  required  to  extract  a  non- 
blank,  non-null  character  from  the  set  of  input  characters.   The 
section  builder  continues  in  this  fashion,  using  the  mode  bits  to 
ensure that only the correct P.E.s are active at each step until all of
the  possible  operations  on  the  current  character  have  been  completed. 
It  then  proceeds  to  the  next  character  and  loads  a  new  control  byte 
into  the  mode  register  until  all  eight  of  the  characters  in  each  P.E. 
have  been  assimilated. 

The kernel of the section builder is the process of
simultaneously extracting the ith character from the eight-character
input word in each of the enabled P.E.s. This could be done by masking
a copy of the input word in each enabled P.E. with a pattern properly
positioned in an ACAR. However, since it is important not only to
extract a character from the word in which it is embedded, but also
to shift that character to a position aligned with the end of a
symbol tail, an alternate method which dispenses with the mask in
favor of a series of shift operations was adopted. As noted before,
each P.E. is equipped with a shifter, which is capable of shifting a
64 bit operand in either direction, end-off or end-around, in a single
clock time. Even with the necessary instruction decoding and other
overhead, the shift instructions require only three clock times for
any desired shift. Furthermore, the shift count can be indexed
separately in each P.E., so that a single instruction may produce a
variety of shift lengths in the different P.E.s. The section builder
uses two end-off shifts to extract the character from the input word
and place it in the rightmost eight bits of register A. The word is
first shifted left far enough to left justify the selected character.
Next, a 56 bit right shift leaves the desired character in the
rightmost byte of a word that contains zeros elsewhere. Delimiter
characters in this position need no further treatment before they are
stored as completed tokens. Otherwise, the extracted character is added
to the partial section symbol tail which is then shifted left eight
bits so that another symbol may be added to the right end.
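
The two-shift extraction is easy to mimic on ordinary 64-bit integers.
The sketch below is illustrative only: the function names are
hypothetical, ASCII codes stand in for the Burroughs character code,
and a Python integer masked to 64 bits stands in for a P.E. register.

```python
MASK64 = (1 << 64) - 1


def extract_char(word: int, i: int) -> int:
    """Extract character i (0 = leftmost) from an eight-character word."""
    left_justified = (word << (8 * i)) & MASK64  # end-off left shift
    return left_justified >> 56                  # 56 bit end-off right shift


def append_to_tail(tail: int, ch: int) -> int:
    """Add a character to the symbol tail, then open a free byte on the right."""
    return ((tail | ch) << 8) & MASK64


word = int.from_bytes(b"R+24 TIM", "big")  # one P.E. input word of the example
tail = 0
for i in (5, 6, 7):                        # the characters T, I, M
    tail = append_to_tail(tail, extract_char(word, i))
print(hex(tail))
```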

When  a  symbol  tail  is  to  be  joined  with  its  companion  header 
to  form  a  completed  section,  the  exact  position  of  the  right  end  of 
the  symbol  tail  becomes  unimportant,  and  the  tail  must  be  shifted  so 
that  its  leftmost  symbol  is  aligned  with  the  header  to  be  added.   A 
rotation  (end-around  shift)  to  the  right  is  used  for  this.  The  symbol 
tails  may  vary  from  one  to  seven  characters  in  length,  so  the  shift 
distance  will  vary.  The  correct  shift  counts  are  derived  from  the 
character count portion of the header accompanying the symbol tail.
Displacing the header three bits from the right of the S register
effectively multiplies the character count by eight to give the
necessary shift count increment. After the header in the S register
is OR-ed with the symbol tail, an eleven bit right rotation aligns
the completed section for storage.
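
The header arithmetic can be checked with a small model. Several
details here are assumptions rather than statements from the text: the
function names are hypothetical, bit 0 is taken as the leftmost bit of
the 64 bit word, the tail is assumed to carry one free byte at its
right end, and the constant 5 added to the shift count follows from
that assumed layout. ASCII stands in for the Burroughs character code.

```python
MASK64 = (1 << 64) - 1
ALPHA_BIT = 1 << 10   # bit 53, counting from the left of the word
STRING_BIT = 1 << 9   # bit 54
COUNT_STEP = 8        # the count ends at bit 60, so one character adds 8


def rotr64(x: int, n: int) -> int:
    """End-around right shift of a 64 bit word."""
    n %= 64
    return ((x >> n) | (x << (64 - n))) & MASK64


def build_section(symbols: str, string: bool = False) -> int:
    """Assemble a header and a tail of one to seven symbols into one word."""
    if not 1 <= len(symbols) <= 7:
        raise ValueError("this sketch handles one to seven symbols")
    a = s = 0  # register A (symbol tail) and register S (header)
    for ch in symbols:
        a = ((a | ord(ch)) << 8) & MASK64
        s += COUNT_STEP
        if string:
            s |= STRING_BIT
        elif ch.isalpha():
            s |= ALPHA_BIT
    shift = (s & 0x1F8) + 5        # eight times the count, plus assumed base
    a = rotr64(a, shift)           # first symbol now sits just after the header
    return rotr64(a | s, 11)       # header lands in the leftmost byte


print(hex(build_section("SUM")))
```

Under these assumptions the section for SUM is the header byte 0x83
followed by the three symbols and zeros to the end of the word.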

Given  the  example  input  statement,  the  section  builder  would 
produce  the  following  result: 

[Figure: R Register (Input string) and A Register (Partial section
symbol tails) for the example input statement.]

The underscored digits represent header bytes, and give the value of
the character count field in the header.

6.   THE SECTION JOINER

6.1  The  Section  Joiner  Process 

Many  of  the  sections  constructed  by  section  builder  are 
complete  tokens.  The  amalgamation  of  the  remaining  sections  into 
tokens is the function of the section joiner. A P.E. that contains
the beginning section of a multi-section token is designated as a
RECEIVER P.E. since this P.E. receives the remaining sections of the
token from adjacent P.E.s and adds them, one at a time, to the growing
partial token until a completed token has been assembled. The ending
section of a multi-section token will always be the first section in
its P.E., which is called a GENERATOR P.E. Finally, LINK P.E.s hold
the  intermediate  or  LINK  sections  of  tokens  which  include  three  or  more 
sections.   A  P.E.  which  is  a  RECEIVER  for  one  token  may  be  a  GENERATOR 
for  another  token,  but  a  LINK  P.E.  must  contain  exactly  one  section 
and  cannot  also  be  either  a  RECEIVER  or  a  GENERATOR  P.E.  The  section 
joiner  maintains  three  64  bit  patterns  in  separate  ACAR  registers  to 
indicate  which  P.E.s  are  RECEIVERS,  GENERATORS,  or  LINKS.  The  registers 
in  each  P.E.  are  tested  to  determine  if  any  complete  sections  were 
stored  by  the  P.E.,  if  any  partial  sections  remain  in  the  registers 
of  the  P.E.,  or  if  the  section  store  indicator  of  the  first  control 
byte  in  the  P.E.  was  zero.   The  patterns  resulting  from  these  tests  are 
used  to  produce  the  RECEIVER,  GENERATOR,  and  LINK  patterns.   In  the 
case  of  the  example  input  statement,  P.E.  0  and  P.E.  2  would  be 
considered  as  RECEIVER  P.E.s,  P.E.  2  and  P.E.  3  would  be  GENERATOR 
P.E.s,  and  P.E.  1  would  be  a  LINK  P.E. 
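
The three roles can be derived from the input sub-blocks directly, as
the following sketch shows. It does not reproduce the register tests
described above: the function name is hypothetical, the rule is a
restatement in terms of characters, string tokens are ignored, and the
blank placement in the example blocks is an assumption.

```python
def pe_roles(blocks):
    """Return the set of roles (RECEIVER, GENERATOR, LINK) for each P.E."""
    def is_token_char(c):
        return c.isalnum()

    n = len(blocks)
    # A token continues into P.E. p if it spans the boundary on its left.
    cont_in = [p > 0 and is_token_char(blocks[p - 1][-1])
               and is_token_char(blocks[p][0]) for p in range(n)]
    cont_out = cont_in[1:] + [False]

    roles = []
    for p in range(n):
        whole = all(is_token_char(c) for c in blocks[p])
        found = set()
        if cont_in[p] and cont_out[p] and whole:
            found.add("LINK")          # exactly one section, open at both ends
        else:
            if cont_in[p]:
                found.add("GENERATOR")  # holds the ending section of a token
            if cont_out[p]:
                found.add("RECEIVER")   # holds the beginning section of a token
        roles.append(found)
    return roles


print(pe_roles(["ANS  ← I", "DENTIFIE", "R+24 TIM", "ES SUM; "]))
```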

6.2  Character Count Determination

The  sections  are  combined  by  merging  all  of  the  section 
headers  into  a  single  token  header  and  by  concatenating  the  section 
symbol  tails.   As  in  the  assembly  of  single-section  tokens,  the  symbol 
tail  of  the  first  partial  section  must  be  aligned  so  that  it  will  be 
adjacent  to  the  token  header.   Subsequent  partial  section  symbol  tails 
are  received  from  neighboring  P.E.s  and  must  be  shifted  to  align  them 
with the previously assembled symbols. If the "three-quote" convention
for inserting quote symbols into strings did not exist, all
intermediate or LINK partial sections would contain exactly eight symbols and
all  sections  except  the  beginning  section  would  be  shifted  identically 
during  the  symbol  concatenation  process.   Unfortunately,  however,  the 
"three-quote"  convention  can  shorten  the  length  of  LINK  sections,  so 
a  separate  shift  count  must  be  used  whenever  a  new  partial  section 
symbol  tail  is  joined  to  the  token.   As  before,  the  character  count 
fields  of  the  section  headers  provide  the  necessary  shift  counts.   The 
R  registers  are  used  to  send  the  character  count  of  the  first  section 
in  each  P.E.  to  lower  numbered  P.E.s  in  shift  register  fashion.  The 
P.E.s  add  the  shift  count  of  their  last  section  to  the  incoming  shift 
counts  to  determine  the  number  of  symbols  that  would  be  obtained  by 
joining  two  sections.   The  R  register  contents  are  again  routed  to 
the  next  lower  numbered  P.E.s  and  the  addition  process  is  repeated. 
The  resulting  sums  now  give  the  number  of  symbols  included  in  three 
sections.   Repeating  the  process  and  saving  the  sums  obtained  at  each 
step finally produces a counter string of eight sums. Each sum
represents the number of symbols accumulated in a successive stage
of the section joining process. The same steps are applied to the
classification bits from the headers, only the bits are "OR"-ed
together instead of added. These are combined with the character
counts to form a word containing eight potential headers in each P.E.
A RECEIVER P.E. forming a token from only two sections uses its
rightmost header byte as the header for the assembled token. A P.E. that
forms a token using the maximum of nine sections uses its leftmost
header byte in that token.
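
The accumulation of character counts can be followed on the example
statement. The sketch is an illustration under assumptions: the
function name is hypothetical, a list rotation stands in for routing
through the R registers, and the last-section count of P.E. 3 is taken
as zero because its input ends with a delimiter and a blank.

```python
def counter_strings(first_counts, last_counts, steps=8):
    """Return, for each P.E., the running symbol counts after each routing step."""
    n = len(first_counts)
    sums = [[] for _ in range(n)]
    running = list(last_counts)   # count of the last section in each P.E.
    routed = list(first_counts)   # count of the first section, to be routed
    for _ in range(steps):
        routed = routed[1:] + routed[:1]  # route to the next lower P.E.
        running = [r + x for r, x in zip(running, routed)]
        for p in range(n):
            sums[p].append(running[p])
    return sums


# First sections: ANS, DENTIFIE, R, ES.  Last sections: I, DENTIFIE, TIM, none.
sums = counter_strings([3, 8, 1, 2], [1, 8, 3, 0], steps=2)
print(sums[0])  # P.E. 0 is the RECEIVER for IDENTIFIER
print(sums[2])  # P.E. 2 is the RECEIVER for TIMES
```

Only the sums that belong to a RECEIVER P.E., up to the step at which
its GENERATOR is reached, are meaningful: ten symbols for IDENTIFIER
after two steps and five for TIMES after one.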

6.3  Symbol  Tail  Concatenation 

The  process  of  joining  the  symbol  tails  is  best  illustrated 
by  examining  the  steps  used  in  assembling  the  multi-section  tokens, 
IDENTIFIER  and  TIMES  in  the  example  input  statement.   First,  the 
chain  of  header  bytes  is  temporarily  stored  in  P.E.  memory  so  that 
the  information  remaining  from  the  section  builder  process  may  be 
returned  to  the  S  and  A  registers,  with  the  symbols  in  the  A 
register  left  justified  in  the  register.  The  symbol  tail  of  the 
first  section  in  each  P.E.  is  also  left  justified  and  loaded  into 
the R register for routing to lower numbered P.E.s. P.E.s processing
the example input would contain the following information at this
point:


                 P.E. 0      P.E. 1      P.E. 2      P.E. 3

R Register (First section symbol tails)

                 ANS         DENTIFIE    R           ES

S Register (Partial section headers)

                 1           8           3

All P.E.s that are not RECEIVER P.E.s are now disabled. This does not
affect  the  routing  action  of  the  R  registers,  but  it  allows  only 
RECEIVER  P.E.s  to  perform  the  other  symbol  joining  operations.   In  the 
example,  only  P.E.  0  and  P.E.  2  will  be  active  initially. 

The  R  register  contents  are  routed  to  the  left  to  begin  the 
first  cycle.   (The  example  is  constructed  as  if  there  were  only  four 
P.E.s  in  the  quadrant,  so  P.E.  0  routes  to  P.E.  3.)  Each  RECEIVER  P.E. 
shifts  the  symbols  in  the  A  register  eight  bits  to  the  right  to  make 
room  for  a  header  and  then  moves  them  to  the  B  register.  The  incoming 
symbol  tail  is  brought  into  the  A  register  and  is  shifted  right  by 
one character position more than the number of characters given by the
character count in the S register. The symbols in the A and B registers
are now properly aligned. After shifting the symbols of the example
statement, the registers of the two RECEIVER P.E.s appear as follows:

[Figure: register contents of the two RECEIVER P.E.s.]

The symbols in the A and B registers are then "OR"-ed together and
the  eight  header  bytes  are  loaded  into  the  S  register.  The  bit  in 
the  "eights"  position  in  the  character  count  is  compared  with  the 
corresponding  bit  in  the  previous  character  count.   If  the  bit  has 
changed,  register  A  is  filled  with  symbols  and  must  be  stored.   Bit  60 
of  the  S  register  is  monitored  for  this  purpose  in  the  first  cycle. 
In  the  example,  P.E.  0  would  be  the  only  P.E.  to  store  its  A  register 
contents.   If  the  A  register  contents  are  stored,  the  symbols  in  the 
R  register  are  again  transferred  to  the  A  register  and  the  symbols 
that  did  not  fit  into  the  A  register  before  are  passed  into  the  B 
register  using  a  double  length  shift.  The  pattern  of  GENERATOR  bits 
in  the  C.U.  is  now  shifted  left  and  compared  with  the  RECEIVER  pattern. 
A  GENERATOR  bit  that  is  shifted  into  coincidence  with  a  RECEIVER  bit 
cancels that bit and causes the corresponding P.E. to be disabled for
the rest of the concatenation process. In the example, P.E. 2 is
disabled in this way after the first cycle.

The  second  cycle  begins  by  routing  the  R  register  contents  to 
the left again. P.E. 0 would be the only P.E. active at this point
in the example, and the registers of this P.E. would have the following
contents:

[Figure: register contents of P.E. 0.]

As before, the incoming symbols are transferred from the R register
to the A register and shifted into alignment with the symbols in the
B register before the symbol groups are "OR"-ed together. The cycles
continue  until  the  absence  of  RECEIVER  P.E.s  signals  the  end  of  the 
concatenation  process.   In  the  example,  this  occurs  at  the  end  of 
the  second  cycle.   In  any  case,  the  process  is  stopped  after  eight 
cycles.  All  RECEIVER  P.E.s  are  then  reactivated  to  store  any  symbols 
remaining  in  the  A  registers  and  to  insert  the  appropriate  header 
byte in the first word of the new multi-section tokens. The tokens are
now completely formed and are ready for storage in the token table or
addition to the output string. At this point, the example input
statement symbols are stored in P.E. memory as follows:

[Figure: completed tokens of the example input statement in P.E. memory.]

7.   TABLE MAINTENANCE

7.1  Table  Use 

After  the  tokens  have  been  created,  the  recognizer  retrieves 
them  one  at  a  time  and  creates  an  internal  identifier  for  each  one. 
Delimiter  tokens  are  transferred  to  the  output  string  without  change. 
Identifier  tokens  are  compared  with  a  reserved  word  table  so  that 
reserved words masquerading as identifiers can be detected and
replaced with an appropriate delimiter type internal identifier. Other
identifier  tokens  as  well  as  number  and  string  tokens  are  located  in 
the  main  token  table  or  are  entered  in  this  table  if  no  matching  entry 
is  found.   The  address  of  the  token  in  the  main  table  is  then  combined 
with  the  header  of  the  token  to  create  the  internal  identifier  that 
represents  the  token  in  the  output  string. 

7.2  The  Reserved  Word  Table 

The recognizer uses two tables, the reserved word table and
the main token table. The two tables could be combined, but the
entries in the reserved word table are predetermined single word
entries that are never changed by the recognizer, so a simpler
arrangement may be used for this table. The Burroughs Extended ALGOL
manual lists 111 reserved words, but 58 of these are reserved words only
when used in certain contexts. For simplicity, the parallel recognizer
arbitrarily treats the entire set as reserved words in all situations.
(If desired, this restriction could be relaxed by treating the
semi-reserved words as identifiers and allowing the analyzer to
detect them.) The reserved word table entries are stored in
consecutive quadrant memory locations and fit easily into two slices
of 64 words each. The query word is broadcast to all P.E. accumulators
where a P.E. test instruction is used to compare the query with the 64
words in the first slice of the reserved word table. The test
instructions report a successful match by setting a specified mode
register bit in each P.E. The test bit pattern is read into an ACAR for
examination. If no ones appear in the pattern, none of the 64 table
entries accessed match the query. Otherwise, the leading one detector
of the C.U. is used to identify the P.E. containing the matching
entry. Essentially, the quadrant is being used as a 64 word
associative memory here. Only two P.E. memory cycles are required to
exhaustively search the reserved word table. Since the entries in the
reserved word table are predetermined and do not change, the fastest
conventional search technique orders these entries so that the set of
possible matching entries is divided in half each time a comparison
is made. On a serial machine, such a binary search procedure would
require at least six memory cycles to determine that a given query
word is not contained in this reserved word table. The only reserved
word that contains more than seven characters is PROCEDURE. A test
for this word is made separately from the search for shorter reserved
words to simplify the table.
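
The slice-by-slice search can be modelled as follows. This is a sketch,
not the recognizer's code: the function names are hypothetical, a list
comparison stands in for the parallel P.E. test instruction, and the
table contents in the usage lines are placeholders rather than actual
reserved words.

```python
SLICE = 64  # one table word per P.E. in each slice


def leading_one(bits):
    """Return the index of the first True bit, or None if there is none."""
    for pe, bit in enumerate(bits):
        if bit:
            return pe
    return None


def find_reserved(query, table):
    """Return (table index or None, number of slices examined)."""
    cycles = 0
    for base in range(0, len(table), SLICE):
        cycles += 1
        # Every P.E. compares the broadcast query with its own table word.
        match_bits = [entry == query for entry in table[base:base + SLICE]]
        pe = leading_one(match_bits)
        if pe is not None:
            return base + pe, cycles
    return None, cycles


table = [f"W{i}" for i in range(110)]   # placeholder entries
print(find_reserved("W70", table))      # found in the second slice
print(find_reserved("NOPE", table))     # a miss costs two slice accesses
```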

7.3  The  Main  Token  Table 

The  token  table  is  much  more  complicated  to  search  and  maintain 
than  the  reserved  word  table.   Since  new  entries  are  constantly 
being added to this table, it is not feasible to order the contents.
However, since the table may grow to include several hundred entries,
it  may  not  be  feasible  to  examine  every  table  entry  for  each  new 
query.   Conventional  recognizers  attack  this  problem  by  dividing  the 
table  into  blocks  (sometimes  called  "buckets").   Some  property  of  all 
or  part  of  the  query  word  is  used  to  associate  this  word  with  exactly 
one  block.   The  recognizer  can  then  limit  the  search  to  the  table 
entries  stored  in  the  selected  block.   Obviously,  this  scheme  is  an 
advantage  only  if  the  entries  are  fairly  evenly  distributed  among  the 
blocks.   Often,  many  of  the  identifiers  are  similar  to  one  another. 
For  example,  all  of  the  variables  associated  with  a  certain  process 
may  begin  with  the  same  letter.   To  keep  this  clustering  effect 
from  adversely  affecting  the  distribution  of  entries  among  table 
blocks,  parts  of  the  query  word  are  usually  transformed  in  some  way 
to  further  "randomize"  the  keys  obtained.   Since  Illiac  IV  has  high 
speed  multiplication  hardware,  the  query  word  is  multiplied  by  a 
"hash  constant"  to  use  as  many  bits  of  the  query  as  possible  in  the 
randomizing  process.   A  four  bit  hash  key  is  derived  from  this  process. 
The  four  bits  designate  one  of  the  16  hash  blocks  used  in  the  table. 
The  table  is  designed  to  hold  a  minimum  of  1024  entries  with  room 
for  2048  entries  under  optimum  conditions.   The  organization  of  the 
table  is  evident  from  the  following  diagram  of  the  table  portion  of 
a typical P.E. memory:

MEMORY ADDRESS          CONTENTS OF MEMORY CELL
(B = Base Address)

B - 1                   End of table pointer
B + 0                   First word of main block #0
B + 1                   First word of main block #1
  ...
B + 15                  First word of main block #15
B + 16                  Second word of main block #0
B + 17                  Second word of main block #1
  ...
B + 31                  Second word of main block #15
B + 32                  First word of extension block #0
B + 33                  First word of extension block #1
  ...
B + 47                  First word of extension block #15
B + 48                  Second word of extension block #0
B + 49                  Second word of extension block #1
  ...
B + 63                  Second word of extension block #15
B + 64                  Continuation word #0
B + 65                  Continuation word #1

Each P.E. has space for one entry for each of the 16 main blocks.
Two  locations  are  provided  for  each  entry,  the  second  location  16 
words  after  the  first  one.  Zeros  are  stored  in  the  first  word 
locations  when  the  table  is  set  up  so  that  empty  spaces  in  a  block 
may  be  found  by  matching  with  a  query  word  of  all  zeros.   A  second 
set  of  16  blocks  is  provided  to  extend  main  blocks  with  more  than  64 
entries.   The  extension  blocks  are  linked  to  the  main  blocks  or  to 
other  extension  blocks  through  a  set  of  32  link  words  kept  in  the 
ADVAST  Data  Buffer  of  the  C.U. 
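
The address arithmetic implied by this layout can be sketched as follows. This is only an illustration in Python, not part of the recognizer; the function and constant names are hypothetical, and the offsets are taken from the memory diagram above.

```python
PES = 64            # P.E.s, each holding one entry per block
BLOCKS = 16         # hash blocks selected by the four bit hash key
MAIN, EXTENSION = 0, 1


def slice_offset(kind: int, word: int, block: int) -> int:
    """Offset from base address B of one word of one block.

    kind  -- MAIN or EXTENSION
    word  -- 0 for the first word of an entry, 1 for the second
    block -- hash block number, 0 through 15
    """
    if kind not in (MAIN, EXTENSION) or word not in (0, 1):
        raise ValueError("kind and word must each be 0 or 1")
    if not 0 <= block < BLOCKS:
        raise ValueError("block must be in 0..15")
    # Main blocks occupy B+0..B+31, extension blocks B+32..B+63;
    # the second word of an entry lies 16 words after the first.
    return kind * 2 * BLOCKS + word * BLOCKS + block


END_OF_TABLE_POINTER = -1       # B - 1
FIRST_CONTINUATION_WORD = 64    # B + 64

main_capacity = PES * BLOCKS            # entries in the main blocks alone
total_capacity = 2 * main_capacity      # main plus extension blocks
```

With 64 P.E.s the main blocks alone hold 1024 entries and the extension blocks bring the total to 2048, which agrees with the capacities quoted for the table.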

When an entry is stored in the table, the first location in
the table receives the first word of the token: a header followed by
up to seven characters. The information stored in the second word
of the table depends upon the length of the token being stored. If
the token contains more than seven but fewer than sixteen characters,
the eighth character is stored in the leftmost byte of the second
word, followed by the rest of the characters of the token and enough
zeros to fill out the memory word. If the token contains more than
fifteen characters, the second word of the table entry is loaded with
the address of a continuation word that holds the second word of
the token. The remainder of the token is stored in subsequent
continuation words. An end of table pointer, stored in the memory
location preceding the first main block word, holds the address of
the last continuation word used.
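
The storage rules above can be illustrated with a short Python sketch. It rests on several assumptions that the text does not settle: words are taken to be eight bytes, the header is treated as a single opaque byte, tokens of seven characters or fewer are given an all-zero second word, and the names `pack_entry` and `continuation_address` are hypothetical.

```python
WORD_BYTES = 8   # assumed: one 64-bit word holds eight 8-bit bytes


def pad(chunk: bytes) -> bytes:
    """Left-justify a chunk in one word and fill the rest with zeros."""
    return chunk.ljust(WORD_BYTES, b"\x00")


def pack_entry(token: bytes, header: int, continuation_address: int):
    """Return (first word, second word, list of continuation words)."""
    if not 1 <= len(token) <= 63:
        raise ValueError("token must hold 1 to 63 characters")
    if not 0 <= header <= 255:
        raise ValueError("header is modelled as one byte")
    # First word: header followed by up to seven characters.
    first = bytes([header]) + token[:7].ljust(7, b"\x00")
    rest = token[7:]
    if len(token) <= 15:
        # Characters eight onward fit in the second word of the entry.
        return first, pad(rest), []
    # Longer tokens: the second word of the entry holds the address of
    # the continuation word that carries the second word of the token.
    second = continuation_address.to_bytes(WORD_BYTES, "big")
    continuation = [pad(rest[i:i + WORD_BYTES])
                    for i in range(0, len(rest), WORD_BYTES)]
    return first, second, continuation
```

Under these assumptions a 63-character token occupies the first word plus seven continuation words, which matches the eight-word maximum token length used in the timing cases.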

The search for a match to a query token is begun by multiplying
the proper section of the first word of the query by the hash constant
and then extracting the four bit hash key. The hash key is added
to a base address to obtain the slice address of the first word of
the selected main block. The query word is compared with the set of
64 first word entries for the selected main block. If no matches
occur, the link corresponding to the main block is checked to see if
the block has been extended. If so, the first word entries of the
associated extension blocks are also compared with the first query
word. When a match is found, the next word of the query token is
broadcast to the P.E.s for matching. An examination of the token
character count indicates whether the second word of the entry should
be matched against the second word of the query token or used to
access a continuation word. If several long tokens with identical
beginning portions are stored in the table, the first comparisons
may produce several bits indicating matches. These bits are "AND"-ed
with the bits produced by subsequent comparisons until at most one
match bit remains when all of the words in the query token have been
used. Note that this process examines several of the candidates for
a match in parallel, where a serial table routine might try several
entries that do not quite match the query before locating the correct
matching entry.
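
The search can be sketched serially in Python to show how the match bits narrow. This is an illustration under stated assumptions, not the Illiac IV routine: the hash constant below is an arbitrary odd value and not the one used in the recognizer, the key is assumed to be the top four bits of the 64-bit product, each entry is modelled as a tuple of words, and an equal word count stands in for the character count check.

```python
MASK64 = (1 << 64) - 1
HASH_CONSTANT = 0x9E3779B97F4A7C15   # hypothetical value for illustration


def hash_key(first_word: int) -> int:
    """Multiply by the hash constant and keep four bits (assumed: top four)."""
    return ((first_word * HASH_CONSTANT) & MASK64) >> 60


def search_block(entries, query):
    """Return the number of the P.E. whose entry matches, or None.

    entries -- one tuple of words per P.E., or None for an empty slot
    query   -- tuple of words making up the query token
    """
    # One match bit per P.E.; empty or wrong-length entries start cleared.
    match_bits = [e is not None and len(e) == len(query) for e in entries]
    for position, query_word in enumerate(query):
        # Each query word is broadcast; every P.E. compares at once and
        # the result is AND-ed into the match bit from earlier words.
        match_bits = [bit and entry[position] == query_word
                      for bit, entry in zip(match_bits, entries)]
        if not any(match_bits):
            return None
    return match_bits.index(True)
```

Entries that share a leading word keep their match bits set after the first comparison and are eliminated only when a later word differs, which is the narrowing the text describes.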

Once  an  entry  is  located  in  the  table,  the  leading  one  detector 
is  used  to  determine  which  P.E.  holds  the  entry.  The  P.E.  number  is 
combined  with  the  slice  address  of  the  entry  to  produce  a  quadrant 
address.   It  is  this  address  that  appears  in  the  output  string. 
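
A minimal Python sketch of this last step follows. The way the P.E. number and slice address are combined is an assumption: the sketch supposes that the 64 words of one slice occupy 64 consecutive quadrant addresses, one per P.E. The function names are hypothetical.

```python
def leading_one(match_bits):
    """Return the position of the first set bit, or None if none is set."""
    for pe_number, bit in enumerate(match_bits):
        if bit:
            return pe_number
    return None


def quadrant_address(slice_address: int, pe_number: int, pes: int = 64) -> int:
    """Combine a slice address and a P.E. number (assumed layout)."""
    if not 0 <= pe_number < pes:
        raise ValueError("P.E. number out of range")
    return slice_address * pes + pe_number
```

For example, under this assumption an entry found in P.E. 5 at slice address 284 would be reported as `quadrant_address(284, 5)`.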
8.   RESULTS

8.1  Execution Time Data

Execution  times  were  estimated  for  the  parallel  recognizer 
for  a  variety  of  conditions.  As  noted  before,  the  total  execution 
time  is  not  the  sum  of  the  individual  instruction  execution  times 
due  to  the  overlap  between  P.E.  and  C.U.  instructions  created  by  the 
final  queue  (FINQ)  and  by  the  action  of  the  C.U.'s  instruction  look- 
ahead  unit. 

All  intermediate  calculations  are  expressed  in  Illiac  IV 
clock  times.   (An  Illiac  IV  clock  period  is  about  40  nanoseconds.) 
However,  since  it  is  desirable  to  have  some  measure  that  is  not 
dependent  on  circuit  speed,  the  final  results  are  expressed  in 
equivalent  memory  cycles.   An  Illiac  IV  memory  cycle  requires  six 
clocks.

The  routines  that  assemble  the  512  input  characters  into 
tokens  are  evaluated  separately  from  the  table  maintenance  and 
output  string  generation  procedures.   The  former  is  evaluated  on  a 
memory  cycle/input  character  basis  while  the  latter  is  judged  on  a 
memory  cycle/output  token  basis.

Table  1  gives  the  execution  times  obtained  for  the  token 
building  routines.   Four  possible  input  character  sets  are  considered: 

CASE I:    Best possible case. No quote symbols appear in the
           input and the longest token is two words in length.

CASE II:   Longest possible tokens (eight words) appear but no
           quote symbols are present in the input.

CASE III:  Longest possible tokens as well as quote symbols
           appear in the input. No more than three quote
           symbols appear contiguously.

CASE IV:   Worst possible case. This very rare input string
           includes 63 consecutive quote symbols which form a
           maximum length token.

                                  CASE I   CASE II   CASE III   CASE IV

Set-up routines                     1560      1560       1560      1560
Quote string builder and first
  cycle of quote-pair routine         --        --       1282      1282
Quote-pair routine
  - additional cycles                 --        --         --      2340
Marker and control string
  generator                          195       195        195       195
Section builder                      728       728        728       728
Section joiner
  - minimum portion                  662       662        662       662
Section joiner
  - additional cycles                 --       595        595       595

Total clocks                        3145      3740       5022      7362
Equivalent memory cycles             525       624        837      1227
Memory cycles/Input character       1.03      1.22       1.63      2.40

Table 1 - Token Building Routines - Execution Times

The execution time stays near one memory cycle/input character unless
quote symbols are present. Even then the increase is slight unless an
absurd number of contiguous quote symbols appears. Obviously, the
three-quote convention would be one of the first things omitted in a
language designed for a parallel recognizer. A conventional recognizer
would probably operate the fastest on a set of all blank input symbols,
but even then at least two memory cycles would be required for each
symbol scanned. As more and more non-blank symbols, especially
delimiter symbols, appear in the input string, a conventional recognizer
slows noticeably. Thus the parallel token building routines provide
a definite speed advantage when processing non-trivial input strings.

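
The arithmetic behind Table 1 can be rechecked with a short Python sketch. The clock counts, the six clocks per memory cycle, and the 512 input characters are taken from the text; the dictionary layout and the function name are hypothetical, and rounding partial memory cycles upward is an inference from the tabulated values rather than a stated rule.

```python
CLOCKS_PER_MEMORY_CYCLE = 6
INPUT_CHARACTERS = 512

ROUTINE_CLOCKS = {
    "set-up": 1560,
    "quote first cycle": 1282,        # quote string builder and first cycle
    "quote additional": 2340,         # quote-pair routine, additional cycles
    "marker and control": 195,
    "section builder": 728,
    "joiner minimum": 662,
    "joiner additional": 595,
}

ALWAYS = ["set-up", "marker and control", "section builder", "joiner minimum"]

# Routines that each case adds to the ones that always run.
CASE_EXTRAS = {
    "I": [],
    "II": ["joiner additional"],
    "III": ["joiner additional", "quote first cycle"],
    "IV": ["joiner additional", "quote first cycle", "quote additional"],
}


def summarize(case: str):
    """Return (total clocks, memory cycles, memory cycles per character)."""
    total = sum(ROUTINE_CLOCKS[name] for name in ALWAYS + CASE_EXTRAS[case])
    cycles = -(-total // CLOCKS_PER_MEMORY_CYCLE)   # integer ceiling
    return total, cycles, round(cycles / INPUT_CHARACTERS, 2)


for case in CASE_EXTRAS:
    print(case, summarize(case))
```

Recomputing the rows this way reproduces the totals, equivalent memory cycles, and per-character figures listed in Table 1.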
The table maintenance and output string routines do not show
a similar advantage, however. The execution time required for these
routines depends upon a multitude of factors including the length of
the token, the number of tokens already in the table, the distribution
of the table entries among the table blocks, and the existence of a
table entry that matches the query token. The execution times for
four widely differing cases are presented in Table 2. The cases
selected are as follows:

CASE I:    A single word reserved word token. (Naturally
           there is a matching entry for this token in the
           reserved word table.)

CASE II:   A single word string token that has a matching entry
           already in the table.

CASE III:  A single word identifier token that does not match
           any entry in the table.

CASE IV:   A maximum length token (eight words) that does not
           match any entry in the table.

                                  CASE I   CASE II   CASE III   CASE IV

Preparation of token for
  table search                       110       139        164       298
Table search                          44       104         77        77
Entry of token into table             --        --         77       283
Creation of output internal
  identifier for token                24        42         42        42

Total clocks                         178       285        360       700
Equivalent memory cycles/Token        30        48         60       117

Table 2 - Table Maintenance Routines - Execution Times

The parallel table maintenance routines show little, if any, gain over
conventional schemes. This occurs because each token receives a
considerable amount of individual attention by the C.U., mostly in
the preparation of the token for the table search. An algorithm that
performed some of the more common pre-search steps in the P.E.s would
help to reduce the dominance of the preparation step in the total
time. Possibly a linear table search could be used for the main token
table, especially if fewer than 500 tokens are expected. But no matter
what modifications are adopted, a well designed conventional table
routine is not easily eclipsed.
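
The weight of the preparation step can be made concrete with the clock counts of Table 2. The Python sketch below is only a recomputation; the dictionary layout and function name are hypothetical, and a step that does not occur in a case is entered as zero clocks.

```python
TABLE_2_CLOCKS = {
    "I":   {"preparation": 110, "search": 44,  "entry": 0,   "output": 24},
    "II":  {"preparation": 139, "search": 104, "entry": 0,   "output": 42},
    "III": {"preparation": 164, "search": 77,  "entry": 77,  "output": 42},
    "IV":  {"preparation": 298, "search": 77,  "entry": 283, "output": 42},
}


def preparation_share(case: str) -> float:
    """Fraction of the total clocks spent preparing the token."""
    steps = TABLE_2_CLOCKS[case]
    return steps["preparation"] / sum(steps.values())


for case in TABLE_2_CLOCKS:
    total = sum(TABLE_2_CLOCKS[case].values())
    print(case, total, round(preparation_share(case), 3))
```

In every case the preparation step accounts for more than two fifths of the clocks, and for a reserved word (Case I) it is well over half.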

Since conventional serial recognizers are usually embedded in
a compiler, etc., an exact comparison cannot be made, but the Illiac IV
parallel computer should provide an increase in string processing
speed of two to fifteen times. Thus the parallel computer should
be a useful tool for string processing.

APPENDIX


ILLIAC IV ASSEMBLY LANGUAGE
LISTING OF PARALLEL RECOGNIZER


.QUOTSiEQU 
.BTMSKlEGU 
,LIM3«  EOu 
.LIM7I  EQU 
.  L  I  M<*>3  I  EOU 
.ENDSTIFOu 
t 
.HlNBRlEOU 

.BLANKIEOU 
.HAMsKjfou 

,CYC^6lEQU 
.IOENT«FOU 
.STRNGlEQU 
.GENRl  EQU 
.NLlMK»EOu 
.RCVRl  EQU 
.ACTIVlEQU 
,HASM«  EQu 
.HMSKI  Equ 
.WRDCTiEOU 
.CLIMTiEou 
.LASTXlEOU 
.XLIMTIEQU 
.ALIMt  FQU 

.PROCDlEQU 

.REl    EQu 

.POYNTiFqu 

.HOMSKIEQU 

.OUTPTlFQU 

.LOALFlEQU 
* 

.HIALF'EOU 
% 

.QUERYtEOU 
.QUERPlFOU 

•XBASElEOU 
% 

.CNTRi 

FRSTRI 

«BASF« 
TSIZF, 

nxsrt  i 

OLDMKI 

MARKS  I 
KTENni 


EQu 
EQU 
F<5U 
E(5U 
EQU 
EQu 

Fqu 

EQU 


**** 
SOlj 
S02J 
SD3) 
SD4; 
SD5j 
$06) 

SO?) 
SD8J 
SD9j 

SDlOJ 

soin 

SD12) 
SD13| 

SOU' 
SD15) 

$016; 

SOU) 
SDU| 

sou; 
so2o; 

S021I 
SD22J 
SD23) 
SD24I 
SD25I 
SD26I 
*D27) 
SD28) 
SD29) 

S032| 

SD36) 
$037) 
»D«6j 

S063| 

«8192; 

.72192) 

.510, 

64*256) 
64x257| 

64x258; 

64*259) 


LABEL  DEC 
PATTE 
MASK 
0  STE 
0  STE 

0  STE 
TELLS 

ENOE 
PATTE 
PATTE 
MASK 
A  STE 
MASK 
MASK 

FLAG 

TNDIC 

FLAG 

FLAG 

MASH 

MASKS 

WORD 

LIMIT 

NUMBE 

LIMIT 

1  STE 
"9PR0 
"REOO 
TEMPO 
MASKS 
OUTPU 
USES 

FOR 
USES 

FOR 
USES 
QUERY 
USES 

EXTE 
COUNT 
FIRST 
256*T 
SET  T 


LARATIONS 
RN  OF  EIG 

ALL  BUT  E 
P  1  UNTIL 
P  1  UNTIL 
P  1  UNTIL 

IF  PREVI 
0  WITH  AN 
RN  OF  BYT 
RN  OF  S  B 
OUT  LEFT 
P  8  UNTIL 
ALL  BUT  I 
ALL  BuT  S 
BITS  FOR 
ATES  NON- 
BITS  FOR 
BITS  FOR 
COOING  CO 

ALL  BUT 
COUNT  OF 

OF  CONTI 
R  OF  LAST 

OF  EXTEN 
P  1  INDEX 
CEDU"  PAT 
0000"  PAT 
RARY  sTor 

OUT  ALL 
T  STRING 
S029-SD31 

alphabeti 

$D32-$034 

alphabeti 

SD36-S044 

♦  1 
$D46-$D6l 
NSION  BLO 
ER  FOR  SE 
STORE  FLA 
BASE 

able  size 


**** 

HT  QUOTE  C 
ACH  BYTES 

3  INDEX  P 

7  INDFX  P 

63  INDEX 
OUS  INPUT 

UNTERMINA 
ES  ■  ULAR 
LANK  CHARA 
4  BITS  OF 

56  INDEX 
DENTIFTER 
TRING  MARK 
GENERATOR 
LINK  P.E.S 
RECEIVER  P 
P.E.S  STIL 
NSTANT 
4-BlT  HASH 
QUERY 
NUATION  WO 

EXTENSION 
SION  BLOCK 

PATTERN  ( 
TERN 
TERN 

AGE  FOR  PO 
BUT  LEFTMO 
INDEX  REGI 

FOR  TEST 
C  CHARACTE 

FOR  TFST 
C  CHARACTE 

FOR  QUERY 


HARACTERS 

LAST  BIT 

ATTERN 

ATTERN 

PATTERN 

BLOCK 

TED  STRING 

GEST  NUMBR 

CTERS 

EACH  BYTE 

PATTERN 

MARKS 

S 

P.E.S 

,E.S 

L  ACTIVE 

KEY 

RD  AREA 
BLK  USED 
AREA 

LIM.O) 


Inter 

ST  BYTE 
STER 
BYTES 
RS 

BYTES 
RS 
STRING 


FOR  LINKS  TO 
CKS 

CTION  JOINER  CYCLES 
G  BIT 

TO  228  wORqS/P.E, 


MARKER  STRING  FROM  PREVIOUS  CYCLE 
MARKER  STRING  FOR  CURRENT  CYCLE 
FND  MARKER  FROM  PREVIOUS  CYCLE 


E 

Tl 
|Aj 


(0) 


64*260;   % TEMP STORAGE FOR PARTIAL SECTION
64*261;   % TEMPORARY STORAGE FOR X REGISTER
64*262;   % TEMPORARY STORAGE FOR S REGISTER
64*263;   % HEADER OF TOKEN BEING FORMED
64*264;   % ADDRESS OF LAST CONTIN. WORD USED
64*265;   % FIRST SLICE OF RESERVED WORD TABLE
64*266;   % SECOND RESERVED WORD TABLE SLICE
64*267;   % USES LOCATIONS 267-282 FOR
          %   COMPLETED SECTIONS
64*268;   % SECTION + 1
64*283;   % TBASE - 1
64*284;   % USES LOCS 284-511 FOR MAIN TABLE
64*285;   % TBASE + 1
64*512;   % SOURCE STRING

% **** SET-UP PROCESS ****
■000  01 00000007000000010 | 8) 

.CYC56;   x  8  STEP  *  UNTIL  5*  INDEX  PATTERN 
■000401 00200401 0020040M 8) 

.bTmSk)  x  mask  all  but  each  bytes  last  bit 

■0074 170360741 70360741 7 |8l 

»HAMSK;   I  MASK  OUT  LEFT  4  BITS  OF  EACH  BYTE 

■  037477  176374771 7637477 1 8; 

.OUOTS)   I  PATTERN  OF  EIGHT  QUOTE  CHARACTERS 

■000001 0000000300000000 i 6; 

•LlM3j    X  0  STEP  1  UNTIL  3  INDFX  PATTERN 

•000  001 0000000700000000  I  8) 

,LIM7J    X  0  STEP  1  UNTIL  7  INDEX  PATTERN 

■  0000010000007700000000  I 6| 

.LIM63)   X  0  STEP  1  UNTIL  63  INDEX  PATTERN 

.ENDST) 

■030060140300601403006018) 

.BLANK;   x  PATTERN  OE  8  BLANK  CHARACTERS 

«005ol2o24o5ol2o24o5ol2|8; 

,HlNBR)   *  PATTERN  OF  BYTES  ■  ULARGEST  NUMBR 

■01 00200401 00200401 0020 | 8) 

.LOALF)   X  FIRST  ALPHA  GROUP  TEST  BYTES 

■  0150320641503206415032  1 6; 
•HIALE) 

■020040100200401002004016) 
,L0AIE*1)X  SECOND  ALPHA  GROUP  TEST  BYTES 
■025o52l2425o52l2425o52|8; 

.HIALF41) 

■030461 142304611 42  30461 t 8) 

.L0ALF*2)X  THIRD  ALPHA  GROUP  TEST  BYTES 

■035072164  35072164  3507218) 

,HlA|.F*2) 

-E.OR.E)  X  ENSURE  ALL  P.E.S  ARE  ENABLED 

•E.ORtE) 

NXSRC) 
OLDMK) 
MARKS) 
XTEND' 
■0000006777777777777777181 


AGAIN:


STL(O) 
LTTd) 
STLU) 
L  T  T  (  2  ) 

5  T  L  (  2  ) 

L  T  T  (  3  ) 

<>tlc3i 

L  TT(0) 

LTTCl) 
C|  0(2)  J 
S  T  l  (  2  3 
T  X  i.  T  H  (  1  ) 

LTTC3J 

l_r>4 

STfl 

l*;.(0) 

CI  RAJ 

S  'A 

ST  A 

fVi  TM(O) 

STLfO, 
I  »TfO) 
-  •l.Cl) 
w  r  T  C  2  ) 
L'2) 
L  »  T  ( 'J  ) 
S  T  L  f  0 ) 
l  »T(1) 
STL(1) 

L  ^A 

L^L(O) 

NFB 

Lr'i.  (  3) 

NFB 

SMAL 

l"- 

Lncf 0) 

ir\p 
RU 

C I .  C  (  1  )  I 

C«.B 

I.  HA 
SFTE 

sftf! 
SmAr 

L.OX 


.HASHJ  «    SET    UP    HASH    CONSTANT 

«36|8| 

,HMS<J  *    4-BIT    HASH    KEY    MASK 

»32j 

.lastxj  x  riRST  Extension  block  begins 

IN  THE  33RD  MAIN  TABLE  WORD 
-  6  3  » 
.XLlMT;  %    I AsT  EXTENSION  BLOCK  ENDS 

IN  64TH  WORD  OF  THE  MAIN  TABLE 
*TSlZE*64J 

.CLIMTJ   I  SET  LlMlT  OF  TABLE 
■OOOD010000001700000000I8I  *  0  STEP  1  TO  15 

.XBAsE(l))  *  InItIAlIzE  16  ExTEnsION 

»~2\  *       BLOCK  LINKS  TO  ZERO 

*64j 

*c3j      %    riRST  CONTINUATION  WORD  Is 

LASTCJ    *   LOCATED  IN  65TH  MAIN  TABLE  WORD 

.LIM63)   X  SET  ACAROINCR  *  ♦!#  ACAROLIM  ■  63 


TBASMI 
TPASEC 

»-2; 

aOOOOO 
.ALIMI 
"17760 
.HDMS* 
aOOOOO 
.OUTPT 
■10444 
.PROCD 
■0?442 
.REj 

**** 
SORCEj 
.OUOTS 

tcoi 
„btmsk 

SC3; 
31 
SAJ 
IAJ 

»N00J0 

ss; 

63J 

*R> 

63| 

sci ; 

NXSRC) 


-E.OR.E* 
-E.OR.E) 
48J 


0)JI  INITIALIZE  TABLE  ENTRIES  TO  ZERO 

10000000000000001  I  8  I 

I  SET  ACAROINCR  ■  *1»  ACAROINDX  ■  I 
00000000000000000181 

J   I  MASKS  OUT  ALL  BUT  LEFTMOST  BYTE 
lOOOlOOOOOOOOOOOOlS)  %    SET  UP  INDEX 
i       X  FOR  8192  WORD  OUTPUT  STRING 
7l222302305212064|8| 
t       *  SET  UP  •,9PR0CE0U,»  PATTERN 
5 00000 0  000000000018  J 

I  SET  UP  "REOOOOOO*  PATTERN 
OUOTF-PAIR  ROUTINE  **** 

*  LOAD  SOURCE  STRING 
i       X  LOAD  PATTERN  OF  8  QUOTE  SYMBOLS 

X  CHECK  FOR  QUOTE  CHARACTERS 
i 

X  OBTAIN  MARK  FOR  EACH  QUOTE  SYMBOL 

X  MOVE  MARK  TO  PROPER  POSITION 

X  SAVE  MARKS 

x  "0RW  marks  Into  acaro 

i       I  SKIP  IF  NO  QUOTES  FOUND 

X  10AD  QUOTE  MARKERS  FOR  ROUTING 
X  ROUTE  MARKERS  LEFT  ONE  P.E. 


SAJ 


X  SET  RIGHTMOST  BlT  OF  ACARl 

X  FNABLE  ONLY  RIGHTMOST  PtE, 

%  BRING  NExT  SOURCE  STRING  WORD  IN 

AT  RIGHT  END  OF  STRING 

X  RE-ENABLE  ALL  P.E.S 

X  FXTRACT  LEFTMOST  TWO  BYTEs 

% SAVE BYTES IN REGISTER X
SHAR

81 

LOB 

SAJ 

LnA 

is; 

SMAL 

8J 

OR 

»BJ 

ANON 

SSI 

LOR 

SAJ 

LnA 

SSI 

SHAL 

161 

flR 

sx; 

And 

SRJ 

Lnc(O) 

SAJ 

ZFRTCO) 

#SBitDj 

LOR 

SAJ 

RTL 

U 

LOB 

SAJ 

LnA 

SRJ 

CROTR(l) 

U 

LnEFl 

sci; 

LnA 

OLDMKJ 

STR 

oldmki 

SFTE 

"EtOR.EJ 

SFTEl 

-E.OR.El 

?TAR 

17) 

.nx 

SAJ 

>HAR 

56J 

»MAL 

56| 

swap; 

>HAR 

9J 

]R 

SB! 

.nR 

SAJ 

)p 

SSJ 

.ns 

SAJ 

nA 

SXJ 

HAL 

56J 

ns 

SAJ 

nA 

SRJ 

HAR 

a; 

9 

SBJ 

nR 

SAJ 

R 

SSJ 

ns 

SAJ 

OA 

SRJ 

HAL 

8J 

9 

sx; 

TAR 

15j 

AND 

ss; 

^S 

SAJ 

<IP 

#AGAINJ 

">A 

SSJ 

*  NOW  QET  LEFTMOST  BYTE  ALONE 

I  SAVE  LEFTMOST  BYTE  In  REGISTER  B 

S  RELOAO  ORIGINAL  QUOTF  MARKERS 

I  MOVE  MARKERS  LEFT  ONF  BYTE  POS, 

*  FORM  NOOUOTE-QUOTE  MARKERS 

I  MOVE  MARKERS  LEFT  TWO  BYTE  POS. 

I    OBTAIN  NOOUOTE-quOTE-quOTE  mArkErc 

I    -OR"  MARKERS  INTO  ACARO 

I    SKIP  IF  NO  MORE  MARKpRS  REMAIN 

«  MOVE  NEW  MARKERS  RIGHT  ONE  P.E, 


%   fnable  only  leftmost  P.E. 

I  OBTAIN  MARKS  FROM  PRFVIOUS  CYCLE 
%    SAVE  MARKER  STRING  WORD  SHIFTED 

OUT  RIGHT  END  FOR  NFXT  CYCLE 
I  RE-ENABLE  ALL  P.E.S 


%    RE-ASSEMBLE  MARKERS  SHIFTED 
RIGHT  9  BITS 


I  RE-AssEMBLE  MARKERS  SHIFTED 
RIGHT  17  BITS 

*  SET  NULL  MARKERS  FOR  SECOND  SYMBOL 
IN  CHAINS  OF  THREE  QUOTES 


I    RESET  QUOTE  MARKER  FOR  THIRD  QUOTE 
SYMBOL  IN  CHAINS  OF  THREE  QUOTES 


t    BEGIN  "MODULO  TWO"  PROCESS 


CSHL(3)
AMD 
LOB 
LHL(O) 

SMAR 

TyLTM(O) 
in 

>FTC(  1  ) 
L<U(2) 
ZTRF(O) 
CnMPC(l )J 

I  ni_(2) 

' -^MR(2) 
C^XORC?) 
TXLTM(0> 
Ct.C(l)) 
CSBt1  ) 
CANDCl  ) 
STL 
0SHR(2) 

rri 

.:  n  m  p  a  i 

LR  A 
A«  3 
SMAR 

OB 

L^S 


ALOOP» 


LHA 

LHR 

LHL(O) 

;  ■"-■ 
SmAL 

op 

5TA 
LOL(O) 

LHA 

Lnul) 

G* 

LOS 

L'^A 

LOLC1) 

LR 

AND 

Oft 

STA 


*  ,BTMSK  IS  ALREADY  IN  ACAR3 

X  extract  only  quote  markers 

X  SET  ACAROINCR  -  *1#  ACAROLIM  m    7 
X  REPEAT  THE  PROCESS  EIGHT  TIMES 


3* 

*C3j 

SAJ 

.LIM7J 
8j 
SBI 
*-3j 

60J 
I» 

.ENosTj 

»U 

sell 

.LIM63J   X  SET  ACAROINCR  ■  *1»  ACAROLlM  ■  63 
U 

scij 
,-3; 

63; 
SC2J 

.ENDSTJ 

U 

SC2J 


-E.OR.E-I 
-E.OR.EJ 
SAJ 


x  re-enable  all  p.e.s 


ssj 

SC3| 


X  GET  ORIGINAL  QUOTE  MARKERS  AGAIN 
X  ALIGN  QUOTE  MARKERS  WITH  NULL 

MARKER  POSITION  IN  MARKER  STRING 
X  SET  REMAINING  QUOTE  MARKS  TO  NULLS 
X  SAVE  MARKER  STRING 
****  MARKER  STRING  GENERATOR  **** 
SORCEJ    X  BEGIN  NUMBER  MARKER 

LOAD  SOURCE  STRING 
X  SAVE  A  COPY  OF  THE  SOURCE  STRING 
X  I OAO  NUMBER  TEST  BYTES 
X  MARK  ALL  DIGIT  CHARACTERS 
X  SHIFT  MARKS  TO  NUMBER  MARKER  POSN 
X  ADD  MARKERS  TO  MARKER  STRING 
%    SAvE  MARKER  STRING 

END  NUMBER  MARKING 
X  BEGIN  MARKING  ALPHABETIC  SYMBOLS 

SET  AcArOIncR  ■  ♦*'  AcArOlIm  ■  3 
SRI 

ic°*1  C0>x  test  for  alphabetic  characters 

$AJ       X  SAVE  TEST  RESULTS  IN  REGISTER  S 
SR|       X  RELOAD  SOURCE  STRING  IN  REGISTER  A 

scil^^^  COMPLETE  TEST  FOR  ALPHA  SYMBOLS 
SSI       X  COMBINE  ALPHA  TEST  RESULTS 
MARKS!    X  ADD  ALPHA  MARKERS  TO  MARKER  STRING 
MARKS'    *  SAVE  MARKER  STRING 


SR> 
SAJ 


SAJ 

.HINBRJ 

SCOJ 

6J 

SSJ 

MARKSJ 

.LIM3J 


* ALOOPJ   X  END  ALPHA  MARKING  AFTER  TESTING 

FOR  THREE  ALPHA  GROUPS 
SA' 
SRI 


.BLANK) 

SCI; 

.BTMSKJ 

sen 

5) 
•SI 

SA) 


S  BEGIN  BLANK  SYMBOL  HARKING 

RELOAD  SOURCE  STRING 
X  LOAD  BLANK  CHARACTER  TEST  BYTES 

x  test  for  blank  characters 
s  obtain  marker  for  each  blank 

S  MOVE  MARKS  TO  BLANK  MARKER  POSTN 
X  ADD  NEW  MARKERS  TO  MARKER  STRING 
X  BEGIN  DELIMITER  MARKING 


SKJ 


SCO) 
SCI) 

1) 
SS) 

** 

SA) 

,BTM 

5) 

SC3) 

SA) 

SS) 

2) 

SC3j 

2) 

SR) 
SA) 
SS) 
2) 

SC3| 

4) 

SR) 

SA) 
56) 
SA) 
1) 

SR) 


1) 

SCO) 
XTEND) 

XTEND) 
•E.OR»E| 

•E.OR.E) 

8) 
SB) 


X  OBTAIN  MARK  FOR  ALL  CHARACTERS  NOT 

PREVIOUSLY  MARKED 
X  SHIFT  DELIMITER  MARKER  INTO  PLACE 
X  ADD  DELIMITER  MARKER  TO  MARKER  STRING 
**  CONTROL  STRING  GENERATOR  **** 


X  FXTRACT  BLANK  CHARACTER  MARKERS 


X  MOVE  MASK  TO  STRING  MARKER 

POSITION  IN  BYTE 
X  EXTRACT  STRING  MARKERS 

X  ROTATE  TO  BIT  POSITION  $2    OF  BYTE 
S  MARK  NON-STRING  BLANk  CHARACTERS 


S  MOVE  MASK  TO  DELIMITER  MARKER  POSTN 
X  EXTRACT  DELIMITER  MARKERS 
X  SHIFT  TO  BIT  POSITION  #2  OF  BYTE 
X  MARKERS  NOw  INDICATE  EITHER 

DELIMITERS  OR  NON'STRING  BLANKS 


x  set  bit  #0  of  acaro 

s  enable  only  leftmost  p,e, 

x  add  marker  from  previous  pass  to 

left  of  string 
x  save  end  marker  for  next  pass 
%   re-enable  all  p.e.s 


X  RE-ASSEMBLE  STRING  SHlFTEO  RIGHT 
EIGHT  BITS 


NAND    $R;          % CREATE PARTIAL SECTION STORAGE IND.
SHAL    1;           % MOVE INDICATOR TO BIT POSITION #1
LDB     $A;
SHAL    1;
OR      $B;          % DUPLICATE INDICATOR BIT IN BIT
                     %   POSITION #0 OF BYTE
LDR     $A;
LDA     $S;
LDL(0)  .HAMSK;
AND     $C0;         % MASK OUT LEFT 4 BITS OF EACH BYTE
OR      $R;          % ADD PARTIAL SECTION STORAGE
                     %   INDICATORS TO CREATE CONTROL STRNG
STA     MARKS;       % CONTROL STRING IS NOW READY FOR USE

% **** SECTION BUILDER ****
CLC(1);              % BEGIN SECTION BUILDER SET-UP
COMPC(1);            % ACAR1 PATTERN TO RAPIDLY ENABLE
                     %   ALL P.E.S
LDL(2)  .IDENT;      % SET UP FLAG BIT FOR IDENTIFIERS
LDL(3)  .STRNG;      % SET UP FLAG BIT FOR STRING SECTIONS
LDR     SORCE;       % LOAD SOURCE STRING IN REGISTER R
CLRA;
LDS     $A;
LDX     $A;
STA     SECTN;
STA     SECTP;       % END SECTION BUILDER SET-UP
LDB     MARKS;       % LOAD FIRST CONTROL BYTE
LDD     $B;          % SET MODE REGISTERS
SETE    -E.OR.-E;    % COMPLEMENT SECTION STORAGE
SETE1   -E1.OR.-E1;  %   INDICATORS
XI      .FRSTR;      % SET FIRSTSTORE FLAG
LDEE1   $C1;         % RE-ENABLE ALL P.E.S
SETE    -H.AND.E;    % CONSIDER ONLY NON-BLANK, NON-NULL
SETE1   -H.AND.E1;   %   CHARACTERS
LDR     $A;
SHAR    56;          % EXTRACT FIRST CHARACTER
SETE    -I.AND.E;    % CONSIDER NON-BLANKS, NON-NULLS, AND
SETE1   -I.AND.E1;   %   NON-DELIMITERS ONLY
RTAL    8;
LDS     $A;
CLRA;
ADMA    =8;          % SET PARTIAL SECTION CHARACTER
                     %   COUNT TO 1
SETE    J.AND.E;     % CONSIDER ONLY ALPHABETIC CHARACTERS
SETE1   J.AND.E1;
OR      $C2;         % SET IDENTIFIER MARKER
LDEE1   $C1;         % RE-ENABLE ALL P.E.S
SETE    G.AND.E;     % CONSIDER ONLY STRING CHARACTERS
SETE1   G.AND.E1;
OR      $C3;         % SET STRING MARKER
LDEE1   $C1;         % RE-ENABLE ALL P.E.S
SETE    -H.AND.E;    % CONSIDER ONLY NON-BLANKS AND
SETE1   -H.AND.E1;   %   NON-NULL CHARACTERS
LDB     $A;
LDA     $S;          % RETURN PARTIAL SECTION HEADER TO
LDS     $B;          %   REGISTER S

SETE

SFTEl 

STA 

CLRAl 

*T 

ldeei 

mL(O) 

I  LOB 
LDA 
SMAL 
SwAp| 

LOO 

RTAR 

OR 

RTAR 

STA 

CLRAI 

LOS 

XT 

LDEEI 

SETE 

STTEl 

LOB 

IDA 

SHAL 

SHAR 

SETE 

SETE1 

OR 

RTAL 

LOB 

LnA 

LOS 

ADMA 

SETE 

SETE1 

OP 

LOEEl 

SETE 

SFTEl 

OR 

LOEEl 

SETE 

SETE* 

SFTE 

LOB 

LnA 

LOB 

SETE 

SETEl 

STA 

CLRAI 

LOS 

XT 


I, AND. El  s  CONSIDER  ONLY  DELIMITER  SYMBOLS 

I. AND. Ell 

♦S^CTNJ   I  STORE  DELIMITER  In  OUTPUT  STRING 


•  1J 

•  CM 

.CYC56; 
SAJ 

MARKSl 
■0(0)1 

SB' 
#51 
SSJ 

111 

*SECTNJ 

SAl 
■  1) 

*CU 

•H.AND, 
•H.AND, 

sa; 
sri 

■0(0)1 
56; 

-I, AND, 
-I .AND. 

sb; 

81 

SAJ 

SSJ 

SBI 
SJ 

J.ANO.E 
J.AND.E 

SC2; 

SCl| 

G.AND.E 
G.ANDtE 

sc3; 

ten 

•H.AND. 

-I tAND. 

-LAND, 

SAl 

SSI 

SSJ 

I. AND.- 

LAND*" 

♦SEcTNl 

sa; 
■  H 


s  re-Enable  all  p.e.s 

COMPLETE  FIRST  CHARACTER 
S  LOAD  8  STER  8  UNTIL  56  INDEX 
I  MOVE  TO  N"TH  CHARACTER 

*  LOAD  CONTROL  BYTES 

*  SELECT  BYTE  #N 

I  MOVE  CONTROL  BYTE  TO  REqIsTEr  b 
%    LOAD  CONTROL  BYTE  IN  MOOE  REGISTER 

I    ADD  HEADER  TO  P*RTl*L  SECTION 

I  ALIGN  PARTIAL  SECTION  FOR  STORAGE 

I  STORE  PARTIAL  SECTION 

I  CREATE  NEW  EMPTY  PARTIAL  SECTION 

»  CLEAR  CHARACTER  COUNT 

I  ENABLE  ALL  P.E.S 
EM  CONSIDER  ONLY  NON-BLANK  AND 
FlJS  NON-NULL  CHARACTERS 


I  EXTRACT  NEXT  CHARACTER 
EJS  CONSIDER  NON-BLANK#NON-NULL  AND 
EUS  NON-OELIMITER  CHARACTERS  ONLY 

S  ADD  NEW  CHARACTER  TO  PARTIAL  SECTN. 


S 

I  I 

H 
I 
I 

I  I 

u 
% 
I 

EJS 
EJ* 
El 


INCREMENT  CHARACTER  COUNT 
CONSIDER  ONLY  ALPHABETIC  SYMBOLS 

ADD  MARKER  FOR  IDENTIFIER 

RE-ENABLE  ALL  P«E*S 

CONSIDER  ONLY  STRING  CHARACTERS 

ADD  STRING  MARKER 
RE-ENABLE  ALL  P.E.S 

consioer  non-blank,  non-null»  and 

NON-DELIMITER  CHARACTERS  ONLY 


I  RETURN  PARTIAL  SECTION  HEADER 
«   TO  REGISTER  S 

e;s  consioer  only  delimiter  CHARACTERS 

Ell 

*  STORE  DELIMITER  IN  OUTPUT  STRING 
S  CREATE  NEW  EMPTY  PARTIAL  SECTION 
I  CLEAR  CHARACTER  COUNT 


LDEE1   $C1;         % RE-ENABLE ALL P.E.S
TXLTM(0) ,REPET;     % CONTINUE UNTIL 8 CHARACTERS HAVE
                     %   BEEN PROCESSED

% **** SECTION JOINER - CHARACTER COUNT ROUTINE ****
IXE     =8192;       % SET I BIT IF X = 2**13 (FIRSTSTORE
SETC(0) I;           %   MARKER = 1, X = 0 OTHERWISE)
IXE     =0;
SETC(1) I;
COR(0)  $C1;         % ACAR0 = NOBREAK
JXG     =4096;       % SET J BIT IF X > 2**12 (FIRSTSTORE
                     %   MARKER = 1)
SETC(1) J;           % ACAR1 = FIRSTSTORE
ISE     =0;          % SET I BIT IF THE P.E. CONTAINS
SETC(2) I;           %   NO UNSTORED PARTIAL SECTIONS
COMPC(2);            % ACAR2 = REMAIN
LDL(3)  $C2;
CSHR(3) 1;
CAND(0) $C3;
CAND(3) $C1;
CSHL(1) 1;
CAND(2) $C1;
CAND(1) $C0;         % ACAR1 = LINK
COMPC(0);
CAND(0) $C2;         % ACAR0 = RECEIVER
LDL(2)  $C1;
COMPC(2);
CAND(2) $C3;         % ACAR2 = GENERATOR
STL(2)  .GENR;       % SAVE GENERATOR PATTERN
LDL(3)  .LIM7;       % SET ACAR3INCR = +1, ACAR3LIM = 7
RTAR    8;           % LEFT JUSTIFY PARTIAL SECTION TAIL
STA     SAVE;        % SAVE "ENDING" PARTIAL SECTION
STS     SSAVE;       % SAVE "ENDING" HEADER
LDA     SECTN;       % LOAD FIRST PARTIAL SECTION STORED
LDEE1   $C1;         % ACTIVATE ONLY LINK P.E.S
LDA     $S;
SHAL    53;          % SHIFTED COUNT IS DIVIDED BY 8
SETE    E.OR.-E;     % CORRECT STARTING SECTION HEADER IS
SETE1   E.OR.-E;     %   NOW IN LEFTMOST BYTE OF REGISTER A
LDB     $A;
SHAL    2;
SHAR    58;
LDR     $A;          % STARTING SECTION CHARACTER COUNT IS
                     %   NOW IN REGISTER R
LDA     $B;
SHAR    62;
SHAL    6;
LDB     $S;          % ENDING SECTION CHARACTER COUNT AND
                     %   MARKERS ARE NOW IN REGISTER B
LDS     $A;          % STARTING SECTION MARKERS NOW IN RGS
LDA     $B;
SHAR    3;
RTL     63;          % ROUTE COUNT BYTES LEFT ONE P.E.
ADMA    $R;          % COMBINE COUNTS
MORE:   RTAR    8;
LDB     $A;

SHAR

OR 

RTL 

AHMA 

TXITM(3) 

RTAR 
LOLO) 

LnR 
Lns 

SMAR 

smal 

RTAR 

Lns 

ShAr 

OR 

RTL 

OP 

TYLTM(3> 

RTAR 

OR 

STA 
LHS 
*  SECTION 
LOL(3) 
STLC3) 
STX 
CLRA) 
LDR 
LOB 

LDA 
SHAL 


561 
SB) 
631 
SRI 

'MORE) 

6) 

.LIM7J 

SS) 

SA) 

62) 

6) 

a) 

SA) 
56) 

SB) 

63) 

SR) 

,MMORE) 

6) 

SS) 

HEAOR) 
SA| 

JOINER  - 
,LlM7) 
.CNTR) 
SAVEXj 

SA) 

SAVE) 

SECTN) 

6) 


LOEE1 

LOA 

LHEE1 

STR 

SETE 
SFTE1 
LOR 
LHA 

SHAL 

Lns 

J« 

SFTCC3) 

CnMPC(l)) 

STL(l) 

CANOCl) 

LnA 

LOEE1 

SHAR 

STA 


SCI) 

SB) 

SC2) 
SECTN) 

E.OR.-E) 
E.OR.-E) 
SA) 
SSAVE) 

3) 

SA) 

57) 

J) 

•NLINK) 

SC3) 

SB| 

SCl) 

6) 

*SECTN) 


I  COMBINE  MARKERS 

I  ROUTE  COUNT  BYTES  LEFT  ONE  P.E, 

f  ADD  IN  NEXT  COUNT  BYTE 

X  REPEAT  SEVEN  TIMES 

I  8  ELEMENT  COUNTER  STRING  COMPLETE 

X  RESET  ACAR3INCR  •  ♦  !•  ACAR3LIM  •  7 

X  SAVE  COUNTER  STRING  IN  REGISTER  S 

x  Ending  section  marks  In  register  a 


X  ROUTE  MARKERS  LETT  ONE  P.E. 

S  COMBINE  MARKERS 

X  REPEAT  SEVEN  TIMES 

S  8  ELEMENT  MARKER  STRING  COMPLETE 

S  FORM  COMPOSITE  MARKER  /  COUNTER 

STRING 
X  SAVE  COMPOSITE  STRING 
X  CHARACTER  COUNT  ROUTINE  COMPLETE 
RECEIVER/LINK/GENERATOR  ROUTINE  **** 
X  SET  ACAR3INCR  ■  *\,    ACAR3LIM  •  7 
X  SET  UP  SECTION  JOINER  CYCLE  COUNTER 
X  SAVE  LOCATION  OF  FIRST  NEW  ENTRY 


X  RELOAD  ENDING  PARTIAL  SECTION 
X  RELOAD  STARTING  SECTION  (IF  ANY) 
X  REMOVE  OLD  HEADER  FROM  STARTING 

SECTION 
X  ACAR1>LINK»  THE  ONLY  CASE  WHERE  AN 

ENDING  PART.  SECTN,  IS  ROUTED  LEFT 

X  ENABLE  ONLY  GENERATOR  P.E.S 
X  CLEAR  STARTING  SECTIONS  IN 

GENERATOR  P.E.S 
X  RE-ENABLE  ALL  P.E.S 

X  SAVE  SECTIONS  TO  BE  ROUTED  LEFT 
X  RETRIEVE  HEADER  OF  ENDING  PARTIAL 

SECTION 
X  MULTIPLY  CHARACTER  COUNTS  BY  EIGHT 

X  LOCATE  8-CHARACTER  PARTIAL  SECTIONS 

X  FORM  -LINK 


X  STORE  ONLY  NON-LlNK#  8-CHARACTER 
X   SECTIONS 
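
The character count routine above builds an eight-element string in each P.E. by routing bytes left one P.E. at a time, combining them, and repeating seven times. The Python sketch below models only that idea and is not a transcription of the ILLIAC IV code. The names `route_left` and `gather_bytes` are hypothetical, and the sketch assumes routing is not end-around (a zero is shifted in at the right edge) and that the P.E.'s own byte ends up leftmost; the listing does not confirm either detail.

```python
def route_left(values, fill=0):
    """Model a one-P.E. route to the left: each P.E. receives the value
    held by its right-hand neighbour (no end-around, an assumption)."""
    return values[1:] + [fill]


def gather_bytes(pe_bytes, repeats=7):
    """Give every P.E. a string made of its own byte followed by the
    bytes of the next `repeats` P.E.s to its right."""
    acc = list(pe_bytes)      # string being built in each P.E.
    routed = list(pe_bytes)   # bytes travelling leftwards
    for _ in range(repeats):
        routed = route_left(routed)                        # next byte arrives from the right
        acc = [(a << 8) | r for a, r in zip(acc, routed)]  # make room, then combine
    return acc


if __name__ == "__main__":
    counts = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    for pe, word in enumerate(gather_bytes(counts)):
        print(pe, format(word, "016x"))
```

Because each byte lands in its own field, combining with OR (as the listing does for markers) and with addition (as it does for counts) give the same result in this model.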


        LDA       $B;
        XT        =1;
        SHAL      56;
        CRB(0)    63;         % PREVENT END-AROUND ROUTING
        STL(0)    =RCVR;      % SAVE PATTERN OF RECEIVER P.E.S
        COMPC(2);             % FORM -GENERATOR
% **** SECTION JOINER - FIRST CONCATENATION CYCLE ****
        LDEE1     $C0;        % CONSIDER ONLY ACTIVE RECEIVERS
        LDB       $A;
        RTL       63;
        LDA       $R;         % LOAD NEXT ELEMENT FROM THE RIGHT
        SHAR      #0;
        OR        $B;         % ADD NEXT ELEMENT TO PARTIAL TOKEN
        SHAR      8;          % MAKE SPACE FOR THE HEADER
        LDB       $A;
        LDA       HEADR;
        JB        60;
        LDA       $B;
        SETC(1)   J;          % CHECK TO SEE IF CURRENT PARTIAL
        CEXOR(3)  $C1;        %  TOKEN IS LARGE ENOUGH TO STORE
        CAND(3)   $C0;
        LDEE1     $C3;
        LDL(3)    $C1;
        STA       *SECTN;     % ADD PARTIAL TOKEN TO OUTPUT
        XT        =1;
        CLRA;
        LDB       $A;
        LDA       $R;
        SHABR     #8;         % PREPARE LEFT PART OF NEXT PARTIAL
                              %  TOKEN IN REGISTER B
        CSHL(2)   1;
        CAND(0)   $C2;        % UPDATE ACTIVE RECEIVER INDICATORS
        LDA       $B;
        LDEE1     $C0;
        LDB       $A;
        LDA       HEADR;      % RELOAD CHAR. COUNT / MARKER STRING
        SHAL      3;          % MULTIPLY COUNT BY EIGHT
        LDS       $A;
        ZERT(0)   ,FINAL;     % END FIRST CONCATENATION CYCLE
% **** SECTION JOINER - EXTRA CONCATENATION CYCLES ****
CYCLE:  LDEE1     $C0;        % CONSIDER ONLY ACTIVE RECEIVER P.E.S
        RTL       63;
        LDA       $R;         % LOAD NEXT ELEMENT FROM THE RIGHT
        SHAR      #8;
        OR        $B;         % ADD NEW ELEMENT TO PARTIAL TOKEN
        LDB       $A;
        LDA       $S;
        JB        49;         % SEE IF NEXT CHARACTER COUNT BYTE,
                              %  ITS "8"-BIT = 1
        LDA       $B;
        SETC(1)   J;          % CHECK TO SEE IF CURRENT PARTIAL
        CEXOR(3)  $C1;        %  TOKEN IS LARGE ENOUGH TO STORE
        CAND(3)   $C0;
        LDEE1     $C3;
        LDL(3)    $C1;
STA
XT
CLRA;
LDB
LDA
SHABR
LDEE1
CSHL(2)
CAND(0)
LDA
SHAR
SHAL
LDS
LDA
EXCHL(0)
TXGFM(0)
HALT;
EXCHL(0)
ZERF(0)
****
LDL(0)
LDEE1
SHAR
LDL(1)
LDE
SETE1
SETE
STA
XT
SETE
SETE1
CLRA;
STA
JXE
SETC(0)
COMPC(0)
STL(3)
LDX
LDB
LDA
SHAL
OR
LDE
SETE1
SETE
STA
SETE
SETE1
LDL(1)
LDL(0)
LDL(2)


*SECTN;
=1;
$A;
$R;
#8;
$C0;
1;
$C2;
$S;
tu
3;
$A;
$B;
=CNTR;
,1;
=CNTR;
,CYCLE;
SECTION JOI
=RCVR;
$C0;
8;
=NLINK;
$C1;
-I.AND.E;
-I.AND.E;
*SECTN;
=1;
E.OR.-E;
E.OR.-E;
*SECTN;
=0;
J;
% ADD PARTIAL TOKEN TO OUTPUT
>l
=ACTIV;
SAVEX;
*SECTN;
$S;
53;
$B;
$C1;
-I.AND.E;
-I.AND.E;
*SECTN;
E.OR.-E;
E.OR.-E;
****
=ACTIV;
=GENR;
$C1;


% PREPARE LEFT PART OF NEXT PARTIAL
%  TOKEN IN REGISTER B
% UPDATE ACTIVE RECEIVER INDICATORS
% SHIFT TO NEXT CHARACTER COUNT
% MULTIPLY COUNT BY EIGHT
% LOAD CYCLE COUNTER
% SKIP IF MORE CYCLES ARE POSSIBLE
% ERROR HALT - TOO MANY CYCLES
% SAVE CYCLE COUNTER
% CONTINUE ONLY IF ACTIVE RECEIVERS
%  STILL REMAIN
NER - FINAL PORTION ****
% RELOAD ORIGINAL RECEIVER PATTERN
% USE PATTERN TO ENABLE P.E.S
% MAKE SPACE FOR HEADER
% RELOAD -LINK
% CONSIDER ONLY (REMAIN), (-LINK)
% ADD LAST ELEMENTS TO OUTPUT
% RE-ENABLE ALL P.E.S
% STORE ZEROS TO MARK END OF OUTPUT
% SAVE BITS THAT INDICATE P.E.S WITH
%  AT LEAST ONE OUTPUT SECTION
% RELOAD POINTER TO FIRST LOCATION
%  USED BY SECTION JOINER FOR OUTPUT
% RELOAD FIRST SECTION JOINER OUTPUT
% OBTAIN HEADER
% ATTACH HEADER TO FINAL TOKENS
% RETURN FIRST ENTRY WORD TO OUTPUT
% RE-ENABLE ALL P.E.S
% END OF SECTION JOINER PROCESS
RESULT STRING GENERATOR ****
% FIND P.E.S WITH TOKENS STORED


        ZERF(2)   1;
        JUMP      CEASE;      % JUMP IF ALL TOKENS PROCESSED
        LEADO(2);             % FIND NEXT P.E. WITH A STORED TOKEN
        CRB(1)    $C2;        % CLEAR MARKER FOR P.E. FOUND
        CTSBF(0)  $C2,1;      % SKIP IF P.E. HAD NO GENERATOR SECT.
        SLIT(2)   =256;       % INDEX PAST FIRST ENTRY OF GEN. P.E.
        ALIT(2)   =QBASE;     % ADD 256xQBASE TO INDEX
NEXWD:  LOAD(2)   =QUERY;     % LOAD FIRST WORD OF TOKEN
        LDL(3)    =QUERY;
        ZERT(3)   ,NEXPE;     % SKIP IF NO TOKENS LEFT IN THIS P.E.
        ALIT(2)   =256;
        LDL(0)    $C3;
        CSHL(3)   2;
        CSHR(3)   58;         % EXTRACT CHARACTER COUNT OF TOKEN
        ZERT(3)   ,EMIT;      % SKIP IF TOKEN IS A DELIMITER
        ALIT(3)   =1;
        CSHL(3)   21;         % FORM WORD COUNT FOR TOKEN
        SLIT(3)   =0;
        STL(3)    =WRDCT;     % SAVE WORD COUNT
        ZERT(3)   ,ONEWD;     % SKIP IF TOKEN IS A SINGLE WORD
        COR(3)    =ALIM;      % SET ACAR3INCR = +1, ACAR3INDX = 1
ANTHR:  LOAD(2)   =QUERY(3);  % GET ADDITIONAL TOKEN WORDS
        ALIT(2)   =256;
        TXLTM(3)  ,ANTHR;
TABLE:  LDL(0)    =QUERY;     % CHECK FOR "PROCEDURE" RESERVED WORD
        CEXOR(0)  =PROCD;     % SKIP IF TOKEN IS NOT A 9-CHARACTER
        ZERF(0)   ,ENTER;     %  IDENT. STARTING WITH "PROCEDU"
        LDL(0)    =QUERP;     % LOAD SECOND WORD OF QUERY TOKEN
        CEXOR(0)  =RE;        % CHECK SECOND WORD OF QUERY TOKEN
        ZERF(0)   ,ENTER;     % SKIP IF TOKEN IS NOT "9PROCEDURE..."
        LIT(0)    =128;       % FORM DUMMY DELIMITER FOR PROCEDURE
        SKIP      ,EMIT;      % OUTPUT DUMMY DELIMITER
ONEWD:  CTSBT(0)  1,ENTER;    % SKIP IF TOKEN IS A STRING TOKEN
        CTSBF(0)  0,ENTER;    % SKIP IF TOKEN IS A NUMBER
        LDA       $C0;
        JLE       RESRV;      % COMPARE WITH RESERVED WORD TABLE
        ILE       RESRP;      % CHECK SECOND RESERVED WORD SLICE
        SETC(1)   J;
        ZERT(1)   ,SECND;     % SKIP IF NO MATCH IN FIRST SLICE
        LEADO(1);             % FIND NUMBER OF MATCHING RES. WORD
        ALIT(1)   =64;        % FORM DUMMY DELIMITER TO
        SKIP      ,EMIT;      %  REPLACE RESERVED WORD
SECND:  SETC(1)   I;
        ZERT(1)   ,ENTER;     % SKIP IF TOKEN NOT A RESERVED WORD
        LEADO(1);             % FIND NUMBER OF MATCHING RES. WORD
        ALIT(1)   =128;       % FORM DUMMY DELIMITER TO
        SKIP      ,EMIT;      %  REPLACE RESERVED WORD
ENTER:  STL(1)    =ACTIV;     % SAVE ACARS
        STL(2)    =POYNT;
        ZERT(1)   1;          % SKIP IF THIS LAST P.E. WITH A TOKEN
        JUMP      SERCH;      % JUMP TO TABLE MAINTENANCE ROUTINE
        LOAD(2)   $C3;
        ZERT(3)   1;          % SKIP IF NO MORE TOKENS IN THIS P.E.
        JUMP      SERCH;      % JUMP TO TABLE MAINTENANCE ROUTINE
        JUMP      CEASE;      % JUMP IF ALL TOKENS STORED BUT ONE
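
The reserved-word test in the listing above compares a token against two slices of a reserved word table, finds the position of the matching word, and adds a base value to form a dummy delimiter that replaces the word. The Python sketch below illustrates that replacement under stated assumptions: the table contents are invented, the function name `replace_reserved` is hypothetical, and the bases 64 and 128 are the values as they read in the listing, which may be affected by print quality. The separate check for PROCEDURE and the parallel comparison across P.E.s are left out.

```python
# Hypothetical table contents; the listing does not show the actual words.
FIRST_SLICE = ["BEGIN", "END", "IF", "THEN"]
SECOND_SLICE = ["ELSE", "FOR", "WHILE"]

FIRST_BASE = 64    # added to the match position in the first slice
SECOND_BASE = 128  # added to the match position in the second slice


def replace_reserved(token):
    """Return ("delimiter", code) for a reserved word, otherwise
    ("identifier", token) so the caller can enter it in the table."""
    if token in FIRST_SLICE:
        return ("delimiter", FIRST_BASE + FIRST_SLICE.index(token))
    if token in SECOND_SLICE:
        return ("delimiter", SECOND_BASE + SECOND_SLICE.index(token))
    return ("identifier", token)


if __name__ == "__main__":
    for tok in ["IF", "WHILE", "COUNT"]:
        print(tok, replace_reserved(tok))
```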
LDL(0)
CAND(0)
COR(0)
LDL(1)
LDL(2)
LDL(3)
STORE(3)
ALIT(3)
STL(3)
SKIP
**** T
LDL(0)
LDL(1)
LDB
LDA
SHAR
MLMA
LOC(1)
CAND(1)
LDA
LDL(2)
JLE
SETC(0)
ZERF(0)
LDL(0)
SLIT(2)
ZERT(0)
LDL(1)
LDL(0)
LDA
SKIP
TXLFM(2)
TXLFM(2)
LDX
LDL(3)
LDA
JLE
SETC(0)
CAND(0)
ZERT(0)
TXLFM(2)
SKIP
LDL(3)
LDA
JLE
SETC(3)
ZERT(0)
LEADO(0)
CSHL(1)
CADD(0)
SKIP
HALT;
**** T


=QUERY;
=HDMSK;
$C1;
=ACTIV;
=POYNT;
=OUTPT;
$C0;
=1;
=OUTPT;
,NEXWD;
ABLE MAINT
=QUERY;
=HASH;
$C1;
$C0;
16;
$B;
$A;
=HMSK;
$C0;
=WRDCT;
TBASE(1);
J;
,MTCH1;
=XBASE(1);
=0;
,ADSTG;
$C0;
=QUERY;
$C0;
,TST;
,FOWND;
,SHORT;
TBASP(1);
=QUERY(2);
$C3;
*TBASH(2);
$C3;
,NOYET;
,FOWND;
,LONGR;
=QUERP;
$C3;
TBASP(1);
J;
,NOYET;


; 


% RELOAD FIRST WORD OF TOKEN
% EXTRACT TOKEN HEADER
% ATTACH HEADER TO RESULT
% RELOAD ACARS
% GET RESULT STRING OUTPUT POINTER
% PUT RESULT IN RESULT STRING
% INCREMENT OUTPUT POINTER
% SAVE OUTPUT POINTER
% CONTINUE TO NEXT TOKEN
ENANCE - SEARCH PROCEDURE ****
% LOAD FIRST WORD OF QUERY TOKEN
% LOAD HASH CODING CONSTANT IN RGB
% LOAD FIRST WORD OF QUERY IN RGA
% SHIFT QUERY WORD TO THE RIGHT OUT
%  OF THE EXPONENT FIELD
% PERFORM HASH CODING MULTIPLICATION
% RETURN MOST SIGNIFICANT HALF OF
%  HASH CODING RESULT TO ACAR1
% CLEAR ALL BITS NOT IN 4-BIT KEY
% ACAR2LIM=(NUMBER OF QUERY WORDS-1)
% TEST BLOCK FOR MATCHING FIRST WORDS
% SKIP IF SOME FIRST WORD MATCHES
% NO MATCH - LOOK FOR EXTENSION BLKS
% SET WORD COUNTER TO FIRST WORD
% SKIP IF NO MORE EXTENSION BLOCKS
% USE EXT. BLK. POINTER VICE HASH KEY
% RELOAD FIRST WORD OF QUERY TOKEN
%  INTO REGISTER A
% TEST FIRST WORDS IN NEW BLOCK
% SKIP IF QUERY IS A SINGLE WORD
% SKIP IF QUERY IS TWO WORDS LONG
% LOAD POINTER TO CONTINUATION WORDS
% GET NEXT QUERY WORD
% TEST FOR MATCH

6;
$C1;
,COMPL;

% SEE IF ALL WORDS MATCH SO FAR
% SKIP IF NO ENTRIES STILL MATCH
% SKIP IF ALL WORDS OF QUERY USED
% TEST NEXT WORD
% GET SECOND QUERY WORD
% TEST FOR MATCH
% SEE IF BOTH QUERY WORDS ARE MATCHED
% SKIP IF BOTH WORDS ARE NOT MATCHED
% FIND NUMBER OF P.E. WITH MATCH
% CONVERT P.E. ADDRESS TO QUAD ADDR.
% FORM COMPLETE RELATIVE QUAD ADDRESS
% ADDRESS OF MATCHING ENTRY IN ACAR0
% ERROR HALT FOR TABLE OVERFLOW
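
The search procedure comments describe a multiplicative hash: the first query word is shifted right out of the exponent field, multiplied by a hash coding constant, and the most significant half of the product is masked down to a 4-bit key that selects a table block. The Python sketch below is one possible reading of those steps, not the listing's exact arithmetic. `HASH_CONSTANT` is a made-up 48-bit value (the real constant is not shown), the 16-bit shift matches the `16` in the operand column, and the dictionary-of-lists table with the hypothetical functions `hash_key` and `search` stands in for the parallel compare across P.E.s. Extension blocks are omitted.

```python
MASK64 = (1 << 64) - 1
MANTISSA_BITS = 48
HASH_CONSTANT = 0x9E3779B97F4A  # hypothetical multiplier, not taken from the listing
KEY_MASK = 0xF                  # keep a 4-bit key


def hash_key(first_word):
    """Reduce the first word of a token to a 4-bit block key."""
    mantissa = (first_word & MASK64) >> 16   # shift out of the exponent field
    product = mantissa * HASH_CONSTANT       # hash coding multiplication
    high_half = product >> MANTISSA_BITS     # most significant half of the product
    return high_half & KEY_MASK              # clear all bits not in the key


def search(table, words):
    """Look for an entry equal to `words` in the block chosen by the key.
    `table` maps a key to a list of entries (tuples of words).
    Returns (key, position) on a match, otherwise None."""
    key = hash_key(words[0])
    for position, entry in enumerate(table.get(key, [])):
        if entry == tuple(words):
            return key, position
    return None


if __name__ == "__main__":
    word = 0x0050524F43454455
    table = {hash_key(word): [(word,)]}
    print(search(table, [word]), search(table, [word, 1]))
```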


ABLE MAINTENANCE - NEW ENTRY PROCEDURE ****


ADSTG:
SPACE:
STOW:
MNYST:
MORST:
QUIT:
CEASE:


LDL(3)
CSHR(3)
LDA
J!  Z;
SETC(0)
GRTRF(3)
LDB
LDX
XT
LDL(3)
J*L
SETC(3)
ZERT(3)
CAND(0)
ZERF(0)
LDL(1)
ALIT(1)
GRTRT(1)
LEADO(0);
CLC(3);
CSB(3)
LDEE1
LDL(0)
LDA
STA
SLIT(2)
TXLTM(2)
TXLFM(2)
LDL(0)
LDA
STA
SKIP
SLIT(2)
LDX
XT
STX
LDA
STA
TXLTM(2)
LDX
XT
STX
SETE
SETE1
JUMP
HALT;
END.


$C2;
24;             % TRANSFER ACAR3LIM TO ACAR3INDX
TBASE(1);
                % FIND ALL ZERO WORDS (EMPTY ENTRIES)
J;
=ALIM,SPACE;    % SKIP IF ENTRY 1 OR 2 WORDS LONG
LASTC;          % GET ADDR. OF LAST CONTIN. WORD USED
$S;
$C3;            % CALCULATE NEW ADDRESS OF LAST WORD
=CLIMT;         % LOAD UPPER LIMIT FOR CONTIN. WORDS
•1(3);          % INSURE UPPER LIMIT NOT EXCEEDED
J;
,NOSPC;         % SKIP IF TABLE OVERFLOWS
$C3;
,STOW;          % SKIP IF SPACE IS ALREADY AVAILABLE
=LASTX;
=2;
=XLIMT,NOSPC;   % SKIP IF NO SPACE CAN BE FOUND
=LASTX;         % UPDATE LAST EXTENSION BLOCK ADDRESS
                % FIND NUMBER OF P.E. WITH MATCH
$C0;
$C3;
=QUERY;
$C0;
TBASC(1);
=0;
,MNYST;
,QUIT;
=QUERP;
$C0;
TBASP(1);
,QUIT;
=1;
$S;
=1;
TBASP(1);
=QUERY(2);
$C0;
*TBASE(2);
,MORST;
$S;
$C2;
LASTC;
E.OR.-E;
E.OR.-E;
COMPL;


% ENABLE SELECTED P.E. TO STORE ENTRY
% LOAD FIRST WORD OF QUERY
% STORE FIRST WORD
% SET ACAR2INDX = 0
% SKIP IF ENTRY HAS MORE THAN 2 WORDS
% SKIP IF ENTRY IS A SINGLE WORD
% LOAD SECOND WORD OF QUERY
% STORE SECOND WORD OF ENTRY
% BOTH QUERY WORDS ARE NOW STORED
% RESET ACAR2INDX = 1
% RGS CONTAINS LASTC, THE ADDRESS OF
%  THE LAST CONTINUATION WORD USED
% INCREMENT CONTINUATION WORD ADDRESS
% STORE POINTER TO LATER ENTRY WORDS
% GET NEXT QUERY WORD TO BE STORED
% STORE QUERY IN CONTINUATION WORD
% SKIP IF MORE WORDS TO STORE
% GET OLD VALUE OF LASTC
% UPDATE LASTC
% SAVE LAST CONTINUATION WORD ADDRESS
% RE-ENABLE ALL P.E.S
% TABLE MAINTENANCE COMPLETE
% END OF RECOGNIZER ****
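
The new entry procedure finds all zero words (empty entries) in the selected block, picks the first P.E. that has one, enables only that P.E., and stores the first word of the query there. The Python sketch below illustrates that selection step only. The function names `leading_one` and `add_entry` are hypothetical, a block is modelled as a plain list with one slot per P.E., and an exception stands in for the listing's extension-block search and overflow halt. Continuation words for entries longer than two words are not modelled.

```python
def leading_one(flags):
    """Return the index of the first set flag, in the manner of a
    leading-one detector, or None if no flag is set."""
    for index, flag in enumerate(flags):
        if flag:
            return index
    return None


def add_entry(block, first_word):
    """Store `first_word` in the first empty (zero) slot of `block`
    and return the number of the slot (P.E.) that was used."""
    empty = [word == 0 for word in block]   # find all zero words
    pe = leading_one(empty)                 # first P.E. with an empty entry
    if pe is None:
        # The listing looks for an extension block here; the sketch just fails.
        raise OverflowError("no empty entry in this block")
    block[pe] = first_word                  # only the selected P.E. stores
    return pe


if __name__ == "__main__":
    block = [0x1111, 0, 0, 0]
    print(add_entry(block, 0x2222), block)
```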




